Cloned genes encoding IG-CD4 fusion proteins and the use thereof.

Fusion proteins of immunoglobulins of the IgM, IgG1 or IgG3 class, wherein the variable region of the light or
heavy chain has been replaced with CD4 or fragments thereof capable of binding to gp120, or immunoglobulin-like
molecules comprising such fusion proteins together with an immunoglobulin light or heavy chain, can be
administered to an animal suffering from HIV or SIV infection. They also are useful in assays for HIV or SIV
comprising contacting a sample suspected of containing HIV or SIV gp120 with the immunoglobulin-like
molecule or fusion protein, and detecting whether a complex is formed.

CLONED GENES ENCODING IG-CD4 FUSION PROTEINS AND THE USE THEREOF



CROSS-REFERENCE TO RELATED APPLICATION 

This application is a continuation-in-part of U.S. Application Serial No. 07/147,351, filed January 22,
1988.

FIELD OF THE INVENTION

The invention is in the field of recombinant genetics.

BACKGROUND OF THE INVENTION 

The human and simian immunodeficiency viruses HIV and SIV are the causative agents of Acquired
Immune Deficiency Syndrome (AIDS) and Simian Immunodeficiency Syndrome (SIDS), respectively. See
Curran, J. et al., Science 229:1352-1357 (1985); Weiss, R. et al., Nature 324:572-575 (1986). The HIV virus
contains an envelope glycoprotein, gp120, which binds to the CD4 protein present on the surface of helper T
lymphocytes, macrophages and other cells. Dalgleish et al., Nature 312:763 (1984). After the gp120 binds to
CD4, virus entry is facilitated by an envelope-mediated fusion of the viral and target cell membranes.

During the course of infection, the host organism develops antibodies against viral proteins, including
the major envelope glycoproteins gp120 and gp41. Despite this humoral immunity, the disease progresses,
resulting in a lethal immunosuppression characterized by multiple opportunistic infections, parasitemia,
dementia and death. The failure of host anti-viral antibodies to arrest the progression of the disease
represents one of the most vexing and alarming aspects of the infection, and augurs poorly for vaccination
efforts based upon conventional approaches.

Two factors may play a role in the inefficacy of the humoral response to immunodeficiency viruses.
First, like other RNA viruses (and like retroviruses in particular), the immunodeficiency viruses show a high
mutation rate which allows antigenic variation to progress at a high rate in response to host immune
surveillance. Second, the envelope glycoproteins themselves are heavily glycosylated molecules presenting
few epitopes suitable for high affinity antibody binding. The poorly antigenic, "moving" target which the viral
envelope presents allows the host little opportunity for restricting viral infection by specific antibody
production.

Cells infected by the HIV virus express the gp120 glycoprotein on their surface. Gp120 mediates fusion
events among CD4 cells via a reaction similar to that by which the virus enters the uninfected cell, leading
to the formation of short-lived multinucleated giant cells. Syncytium formation is dependent on a direct
interaction of the gp120 envelope glycoprotein with the CD4 protein. Dalgleish et al., supra; Klatzmann, D.
et al., Nature 312:763 (1984); McDougal, J.S. et al., Science 231:382 (1986); Sodroski, J. et al., Nature
322:470 (1986); Lifson, J.D. et al., Nature 323:725 (1986); Sodroski, J. et al., Nature 321:412 (1986).

The CD4 protein consists of a 370 amino acid extracellular region containing four immunoglobulin-like
domains, a membrane spanning domain, and a charged intracellular region of 40 amino acid residues.
Maddon, P. et al., Cell 42:93 (1985); Clark, S. et al., Proc. Natl. Acad. Sci. (USA) 84:1649 (1987).

Evidence that CD4-gp120 binding is responsible for viral infection of cells bearing the CD4 antigen
includes the finding that a specific complex is formed between gp120 and CD4. McDougal et al., supra.
Other workers have shown that cell lines which were non-infective for HIV were converted to infectable cell
lines following transfection and expression of the human CD4 cDNA gene. Maddon et al., Cell 47:333-348
(1986).

In contrast to the majority of antibody-envelope interactions, the receptor-envelope interaction is
characterized by a high affinity (Ka ≈ 10^9 l/mole) immutable association. Moreover, the affinity of the virus for
CD4 is at least 3 orders of magnitude higher than the affinity of CD4 for its putative endogenous ligand, the
MHC class II antigens. Indeed, to date, a specific physical association between monomeric CD4 and class II
antigens has not been demonstrated.

In response to bacterial or other particle infection, the host organism usually produces serum antibodies
that bind to specific proteins or carbohydrates on the bacterial or particle surface, coating the bacteria. This
antibody coat on the bacterium or other particle stimulates cytolysis by Fc-receptor-bearing lymphoid cells
by antibody-dependent cellular cytotoxicity (ADCC). Other serum proteins, collectively called complement (C),
bind to antibody-coated targets, and also can coat foreign particles nonspecifically. They cause cell death
by lysis, or stimulate ingestion by binding to specific receptors on the macrophage called complement
receptors. See Darnell, J. et al., in Molecular Cell Biology, Scientific American Books, pp. 641 and 1087
(1988).

The most effective complement activating classes of human Ig are IgM and IgG1. The complement
system consists of 14 proteins that, acting in order, cause lysis of cells. Nearly all of the C proteins exist in
normal serum as inactive precursors. When activated, some become highly specific proteolytic enzymes
whose substrate is the next protein in a sequential chain reaction.

The entire C sequence can be triggered by either of two initiation pathways. In one (the classic
pathway), Ab·Ag complexes bind and activate C1, C4 and C2 to form a C3-splitting enzyme. In the second
pathway, polysaccharides commonly on the surface of many bacteria and fungi bind with trace amounts of
a C3 fragment and then with two other proteins (factor B and properdin) to form another C3-splitting
enzyme. Once C3 is split by either pathway, the way is open for the remaining sequence of steps which
lead to cell lysis. See Davis, B.D., et al., in Microbiology, 3rd ed., Harper and Row, Philadelphia, PA, pp.
452-466 (1980).

A number of workers have disclosed methods for preparing hybrid proteins. For example, Murphy,
United States Patent 4,675,382 (1987), discloses the use of recombinant DNA techniques to make hybrid
protein molecules by forming the desired fused gene coding for a hybrid protein of diphtheria toxin and a
polypeptide ligand such as a hormone, followed by expression of the fused gene.

Many workers have prepared monoclonal antibodies (Mabs) by recombinant DNA techniques.
Monoclonal antibodies are highly specific, well-characterized molecules in both primary and tertiary structure.
They have been widely used for in vitro immunochemical characterization and quantitation of antigens.
Genes for heavy and light chains have been introduced into appropriate hosts and expressed, followed by
reaggregation of the individual chains into functional antibody molecules (see, for example, Munro, Nature
312:597 (1984); Morrison, S.L., Science 229:1202 (1985); Oi et al., Biotechniques 4:214 (1986); Wood et al.,
Nature 314:446-449 (1985)). Light- and heavy-chain variable regions have been cloned and expressed in
foreign hosts wherein they maintained their binding ability (Moore et al., European Patent Application
0088994 (published September 21, 1983)).

Chimeric or hybrid antibodies have also been prepared by recombinant DNA techniques. Oi and
Morrison, Biotechniques 4:214 (1986) describe a strategy for producing such chimeric antibodies which
include a chimeric human IgG anti-leu3 antibody.

Gascoigne, N.R.J., et al., Proc. Natl. Acad. Sci. (USA) 84:2936-2940 (1987) disclose the preparation of a
chimeric gene construct containing a T-cell receptor α-chain variable (V) domain and the constant (C) region
coding sequence of an immunoglobulin γ2a molecule. Cells transfected with the chimeric gene synthesize a
protein product that expresses immunoglobulin and T-cell receptor antigenic determinants as well as protein
A binding sites. This protein associates with a normal λ chain to form an apparently normal tetrameric
(H2L2, where H = heavy and L = light) immunoglobulin molecule that is secreted.

Sharon, J., et al., Nature 309:54 (1984), disclose construction of a chimeric gene encoding the variable
(V) region of a mouse heavy chain specific for the hapten azophenylarsonate and the constant (C) region of
a mouse kappa light chain (VHCκ). This gene was introduced into a mouse myeloma cell line. The chimeric
gene was expressed to give a protein which associated with light chains secreted from the myeloma cell
line to give an antibody molecule specific for azophenylarsonate.

Morrison, Science 229:1202 (1985), discloses that variable light- or variable heavy-chain regions can be
attached to a non-Ig sequence to create fusion proteins. This article states that the potential uses for the
fusion proteins are three: (1) to attach antibody specifically to enzymes for use in assays; (2) to isolate
non-Ig proteins by antigen columns; and (3) to specifically deliver toxic agents.

Recent techniques for the stable introduction of immunoglobulin genes into myeloma cells (Banerji, J.,
et al., Cell 33:729-740 (1983); Potter, H., et al., Proc. Natl. Acad. Sci. (USA) 81:7161-7165 (1984)), coupled
with detailed structural information, have permitted the use of in vitro DNA methods, such as mutagenesis,
to generate recombinant antibodies possessing novel properties.

PCT Application WO87/02671 discloses methods for producing genetically engineered antibodies of
desired variable region specificity and constant region properties through gene cloning and expression of
light and heavy chains. The mRNA from cloned hybridoma B cell lines which produce monoclonal
antibodies of desired specificity is isolated for cDNA cloning. The generation of light and heavy chain
coding sequences is accomplished by excising the cloned variable regions and ligating them to light or
heavy chain module vectors. This gives cDNA sequences which code for immunoglobulin chains. The lack
of introns allows these cDNA sequences to be expressed in prokaryotic hosts, such as bacteria, or in lower
eukaryotic hosts, such as yeast.

The generation of chimeric antibodies in which the antigen-binding portion of the immunoglobulin is
fused to other moieties has been demonstrated. Examples of non-immunoglobulin genes fused to
antibodies include Staphylococcus aureus nuclease, the mouse oncogene c-myc, and the Klenow fragment of
E. coli DNA polymerase I (Neuberger, M.S., et al., Nature 312:604-612 (1984); Neuberger, M.S., Trends in
Biochemical Science, 347-349 (1985)). European Patent Application 120,694 discloses the genetic engineering
of the variable and constant regions of an immunoglobulin molecule that is expressed in E. coli host
cells. It is further disclosed that the immunoglobulin molecule may be synthesized by a host cell with
another peptide moiety attached to one of the constant domains. Such peptide moieties are described as
either cytotoxic or enzymatic. The application and the examples describe the use of a lambda-like chain
derived from a monoclonal antibody which binds to 4-hydroxy-3-nitrophenyl (NP) haptens.

European Patent Application 125,023 relates to the use of recombinant DNA techniques to produce
immunoglobulin molecules that are chimeric or otherwise modified. One of the uses described for these
immunoglobulin molecules is for whole-body diagnosis and treatment by injection of the antibodies directed
to specific target tissues. The presence of the disease can be determined by attaching a suitable label to
the antibodies, or the diseased tissue can be attacked by carrying a suitable drug with the antibodies. The
application describes antibodies engineered to aid the specific delivery of an agent as "altered antibodies."

PCT Application WO83/101533 describes chimeric antibodies wherein the variable region of an
immunoglobulin molecule is linked to a portion of a second protein which may comprise the active portion of
an enzyme.

Boulianne et al., Nature 312:643 (1984) constructed an immunoglobulin gene in which the DNA
segments that encode mouse variable regions specific for the hapten trinitrophenol (TNP) are joined to
segments that encode human mu and kappa regions. These chimeric genes were expressed to give
functional TNP-binding chimeric IgM.

Morrison et al., P.N.A.S. (USA) 81:6851 (1984), disclose a chimeric molecule utilizing the heavy-chain
variable region exons of an anti-phosphoryl choline myeloma protein Q, which were joined to the exons of
either human kappa light-chain gene. The genes were transfected into mouse myeloma cell lines,
generating transformed cells that produced chimeric mouse-human IgG with antigen-binding function.

Despite the progress that has been achieved on determining the mechanism of HIV infection, a need
continues to exist for methods of treating HIV viral infections.

SUMMARY OF THE INVENTION 



The invention relates to a gene comprising a DNA sequence which encodes a fusion protein comprising
1) CD4, or a fragment thereof which binds to HIV gp120, and 2) an immunoglobulin light or heavy chain;
wherein said CD4 or HIV gp120-binding fragment thereof replaces the variable region of the light or heavy
immunoglobulin chain.

The invention also relates to vectors containing the gene of the invention and hosts transformed with the
vectors.

The invention also relates to a method of producing a fusion protein comprising CD4, or fragment
thereof which binds to HIV gp120, and an immunoglobulin light or heavy chain, wherein the variable region
of the immunoglobulin light or heavy chain has been substituted with CD4, or HIV gp120-binding fragment
thereof, which comprises:

cultivating in a nutrient medium under protein producing conditions, a host strain transformed with the
vector containing the gene of the invention, said vector further comprising expression signals which are
recognized by said host strain and direct expression of said fusion protein, and

recovering the fusion protein so produced.

The invention also relates to a fusion protein comprising CD4, or fragment thereof which is capable of
binding to HIV gp120, fused at the C-terminus to a second protein which comprises an immunoglobulin light
or heavy chain, wherein the variable region of said light or heavy chain is substituted with CD4 or a HIV
gp120 binding fragment thereof.

The invention also relates to an immunoglobulin-like molecule comprising the fusion protein of the
invention together with an immunoglobulin light or heavy chain, wherein said immunoglobulin-like molecule
binds HIV gp120.

The IgG1 fusion proteins and immunoglobulin-like molecules may be useful for both complement-mediated
and cell-mediated (ADCC) immunity, while the IgM fusion proteins are useful principally through
complement-mediated immunity.

The invention also relates to a complex between the fusion proteins and immunoglobulin-like molecule
of the invention and HIV gp120.

The invention also relates to a method for treating HIV or SIV infections comprising administering the
fusion protein or immunoglobulin-like molecule of the invention to an animal.

The invention further relates to a method for detecting HIV gp120 in a sample comprising contacting a
sample suspected of containing HIV or gp120 with the fusion protein or immunoglobulin-like molecule of the
invention, and detecting whether a complex has formed.



DESCRIPTION OF THE PREFERRED EMBODIMENTS



The invention is directed to a protein gene which comprises
1) a DNA sequence which codes for CD4, or fragment thereof which binds to HIV gp120, fused to
2) a DNA sequence which encodes an immunoglobulin heavy chain.

Preferably, the antibody has effector function.

The invention is also directed to a protein gene which comprises
1) a DNA sequence which codes for CD4, or fragment thereof which binds to HIV gp120, fused to
2) a DNA sequence which encodes an immunoglobulin light chain; wherein said sequence which
codes for CD4, or HIV gp120-binding fragment thereof, replaces the variable region of the light
immunoglobulin chain.

The invention is also directed to the expression of these novel fusion proteins in transformed hosts and
the use thereof to treat and diagnose HIV infections. In particular, the invention relates to expressing said
genes in mammalian hosts which express complementary light or heavy chain immunoglobulins to give
immunoglobulin-like molecules which have antibody effector function and also bind to HIV or SIV gp120.

The term "antibody effector function" as used herein denotes the ability to fix complement or to 
activate ADCC. 

The fusion proteins and immunoglobulin-like molecules may be administered to an animal for the
purpose of treating HIV or SIV infections. By the term "HIV infections" is intended the condition of having
AIDS, AIDS related complex (ARC) or where an animal harbors the AIDS virus, but does not exhibit the
clinical symptoms of AIDS or ARC. By the term "SIV infections" is intended the condition of being infected
with simian immunodeficiency virus.

By the term "animal" is intended all animals which may derive benefit from the administration of the
fusion proteins and immunoglobulin-like molecules of the invention. Foremost among such animals are
humans; however, the invention is not intended to be so limited.

By the term "fusion protein" is intended a fused protein comprising CD4, or fragment thereof which is
capable of binding to gp120, linked at its C-terminus to an immunoglobulin chain wherein a portion of the
N-terminus of the immunoglobulin is replaced with CD4. In general, that portion of immunoglobulin which is
deleted is the variable region. The fusion proteins of the invention may also comprise immunoglobulins
where more than just the variable region has been deleted and replaced with CD4 or HIV gp120 binding
fragment thereof. For example, the VH and CH1 regions of an immunoglobulin chain may be deleted.
Preferably, any amount of the N-terminus of the immunoglobulin heavy chain can be deleted as long as the
remaining fragment has antibody effector function. The minimum sequence required for binding
complement encompasses domains CH2 and CH3. Joining of Fc portions by the hinge region is advantageous
for increasing the efficiency of complement binding.
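
The domain bookkeeping in this definition can be illustrated with a short sketch. It is only a toy model, assuming an IgG-type heavy chain laid out N-terminus to C-terminus as VH, CH1, hinge, CH2, CH3; the names HEAVY_CHAIN_DOMAINS, build_fusion and can_bind_complement are hypothetical and do not appear in the specification.

```python
# Toy model of a CD4-heavy chain fusion (illustrative assumption, not the
# specification's method). Assumed IgG-type domain order, N- to C-terminus.
HEAVY_CHAIN_DOMAINS = ["VH", "CH1", "hinge", "CH2", "CH3"]


def build_fusion(cd4_part, deleted):
    """Return the domain order of a fusion in which `cd4_part` replaces
    the N-terminal heavy chain domains listed in `deleted`."""
    n = len(deleted)
    # Only a contiguous stretch starting at the N-terminus may be replaced.
    if n == 0 or deleted != HEAVY_CHAIN_DOMAINS[:n]:
        raise ValueError("deleted domains must be a contiguous N-terminal stretch")
    return [cd4_part] + HEAVY_CHAIN_DOMAINS[n:]


def can_bind_complement(domains):
    """Minimum requirement stated in the text: CH2 and CH3 both present."""
    return "CH2" in domains and "CH3" in domains


if __name__ == "__main__":
    v_only = build_fusion("CD4", ["VH"])
    v_and_ch1 = build_fusion("CD4", ["VH", "CH1"])
    print(v_only, can_bind_complement(v_only))
    print(v_and_ch1, can_bind_complement(v_and_ch1))
```

In this model, replacing VH alone or VH plus CH1 leaves CH2 and CH3 in place, so both constructs meet the stated minimum for complement binding, and both retain the hinge.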

The CD4 portion of the fusion protein may comprise the complete CD4 sequence, the 370 amino acid
extracellular region and the membrane spanning domain, or the extracellular region. The fusion protein may
comprise fragments of the extracellular region obtained by cutting the DNA sequence which encodes CD4
at the BspMI site at position 514 or the PvuII site at position 629 (see Table 1) to give nucleotide
sequences which encode CD4 fragments which retain binding to gp120. In general, any fragment of CD4
may be used as long as it retains binding to gp120.
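
Locating such restriction sites in a coding sequence can be sketched as follows. This is a minimal illustration, assuming the standard recognition sequences ACCTGC (BspMI, non-palindromic, so both strands are searched) and CAGCTG (PvuII); the function names and the short demo sequence are made up for the example and are not the CD4 sequence of Table 1, so the positions it reports are unrelated to positions 514 and 629.

```python
# Sketch: find restriction enzyme recognition sites on either strand.
# The demo sequence is invented; it is NOT the CD4 sequence of Table 1.
RECOGNITION = {"BspMI": "ACCTGC", "PvuII": "CAGCTG"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, site):
    """Return sorted 1-based start positions (top-strand coordinates)
    where `site` occurs on the top strand or on the bottom strand."""
    seq = seq.upper()
    hits = set()
    # A palindromic site equals its reverse complement, so the set holds
    # one pattern; a non-palindromic site needs both orientations.
    for pattern in {site, reverse_complement(site)}:
        start = seq.find(pattern)
        while start != -1:
            hits.add(start + 1)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    demo = "ATGAACCTGCTTTTCAGCTGAAAGCAGGTCC"
    for enzyme, site in RECOGNITION.items():
        print(enzyme, find_sites(demo, site))
    # BspMI [5, 24]
    # PvuII [15]
```

The sketch reports only where the recognition sequence begins. BspMI cleaves at a short distance downstream of its recognition sequence, so the actual cut coordinate would have to be derived separately for any real construct.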

Where the fusion protein comprises an immunoglobulin light chain, it is necessary that no more of the
Ig chain be deleted than is necessary to form a stable complex with a heavy chain Ig. In particular, the
cysteine residues necessary for disulfide bond formation must be preserved on both the heavy and light
chain moieties.

When expressed in a host, e.g., a mammalian cell, the fusion protein may associate with other light or
heavy Ig chains secreted by the cell to give a functioning immunoglobulin-like molecule which is capable of
binding to gp120. The gp120 may be in solution, expressed on the surface of infected cells, or may be
present on the surface of the HIV virus itself. Alternatively, the fusion protein may be expressed in a
mammalian cell which does not secrete other light or heavy Ig chains. When expressed under these
conditions, the fusion protein may form a homodimer.

Genomic or cDNA sequences may be used in the practice of the invention. Genomic sequences are
expressed efficiently in myeloma cells, since they contain native promoter structures.

The constant regions of the antibody cloned and used in the chimeric immunoglobulin-like molecule
may be derived from any mammalian source. The constant regions may be complement binding or ADCC
active. However, preliminary work (see Examples) indicates that the fusion proteins of the invention may
mediate HIV or SIV infected cell death by an ADCC or complement-independent mechanism. The constant
regions may be derived from any appropriate isotype, including IgG1, IgG3, or IgM.

The joining of various DNA fragments is performed in accordance with conventional techniques,
employing blunt-ended or staggered-ended termini for ligation, restriction enzyme digestion to provide
appropriate termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and ligation with appropriate ligases. The genetic construct may optionally encode a
leader sequence to allow efficient expression of the fusion protein. For example, the leader sequence
utilized by Maddon et al., Cell 42:93-104 (1985) for the expression of CD4 may be used.

For cDNA, the cDNA may be cloned and the resulting clone screened, for example, by use of a
complementary probe or by assay for expressed CD4 using an antibody as disclosed by Dalgleish et al.,
Nature 312:763-766 (1984); Klatzmann et al., Immunol. Today 7:291-297 (1986); McDougal et al., J.
Immunol. 135:3151-3162 (1985); and McDougal, J. et al., J. Immunol. 137:2937-2944 (1986).

To express the fusion hybrid protein, transcriptional and translational signals recognized by an
appropriate host element are necessary. Eukaryotic hosts which may be used include mammalian cells
capable of culture in vitro, particularly leukocytes, more particularly myeloma cells or other transformed or
oncogenic lymphocytes, e.g., EBV-transformed cells. Alternatively, non-mammalian cells may be employed,
such as bacteria, fungi, e.g., yeast, filamentous fungi, or the like.

Preferred hosts for fusion protein production are mammalian cells, grown in vitro in tissue culture or in
vivo in animals. Mammalian cells provide post-translational modification to immunoglobulin protein
molecules which provide for correct folding and glycosylation of appropriate sites. Mammalian cells which may
be useful as hosts include cells of fibroblast origin such as VERO or CHO-K1, or cells of lymphoid origin,
such as the hybridoma SP2/0-AG14 or the myeloma P3x63Ag8, and their derivatives. For the purpose of
preparing an immunoglobulin-like molecule, a plasmid containing a gene which encodes a heavy chain
immunoglobulin, wherein the variable region has been replaced with CD4 or fragment thereof which binds to
gp120, may be introduced, for example, into J558L myeloma cells, a mouse plasmacytoma expressing the
lambda-1 light chain but which does not express a heavy chain (see Oi et al., P.N.A.S. (USA) 80:825-829
(1983)). Other preferred hosts include COS cells, BHK cells and hepatoma cells.

The constructs may be joined together to form a single DNA segment or may be maintained as 
separate segments, by themselves or in conjunction with vectors. 

Where the fusion protein is not glycosylated, any host may be used to express the protein which is
compatible with replicon and control sequences in the expression plasmid. In general, vectors containing
replicon and control sequences which are derived from species compatible with a host cell are used in connection
with the host. The vector ordinarily carries a replicon site, as well as specific genes which are capable of
providing phenotypic selection in transformed cells. The expression of the fusion protein can also be placed
under control with other regulatory sequences which may be homologous to the organism in its
untransformed state. For example, lactose-dependent E. coli chromosomal DNA comprises a lactose or lac operon
which mediates lactose utilization by elaborating the enzyme beta-galactosidase. The lac control elements
may be obtained from bacterial phage lambda plac5, which is infective for E. coli. The lac promoter-operator
system can be induced by IPTG.

Other promoter/operator systems or portions thereof can be employed as well. For example, colicin
E1, galactose, alkaline phosphatase, tryptophan, xylose, and the like can be used.

For mammalian hosts, several possible vector systems are available for expression. One class of
vectors utilizes DNA elements which are derived from animal viruses such as bovine papilloma virus,
polyoma virus, adenovirus, vaccinia virus, baculovirus, retroviruses (RSV, MMTV or MOMLV), or SV40 virus.
Cells which have stably integrated the DNA into their chromosomes may be selected by introducing one or
more markers which allow selection of transfected host cells. The marker may provide for prototrophy to an
auxotrophic host, biocide resistance, e.g., antibiotics, or heavy metals such as copper or the like. The
selectable marker gene can be either directly linked to the DNA sequences to be expressed, or introduced
into the same cell by cotransformation. Additional elements may also be needed for optimal synthesis of
mRNA. These elements may include splice signals, as well as transcriptional promoters, enhancers, and
termination signals. The cDNA expression vectors incorporating such elements include those described by
Okayama, H., Mol. Cel. Biol. 3:280 (1983) and others.

Once the vector or DNA sequence containing the constructs has been prepared for expression, the
DNA constructs may be introduced to an appropriate host. Various techniques may be employed, such as
protoplast fusion, calcium phosphate precipitation, electroporation or other conventional techniques. After
the fusion, the cells are grown in media and screened for the appropriate activity. Expression of the gene(s)
results in production of the fusion protein. This expressed fusion protein may then be subject to further
assembly to form the immunoglobulin-like molecule.

The host cells for immunoglobulin production may be immortalized cells, primarily myeloma or
lymphoma cells. These cells may be grown in appropriate nutrient medium in culture flasks or injected into
a syngeneic host, e.g., mouse or a rat, or immunodeficient host or host site, e.g., nude mouse or hamster
pouch. In particular, the cells may be introduced into the abdominal cavity of an animal to allow production
of ascites fluid which contains the immunoglobulin-like molecule. Alternatively, the cells may be injected
subcutaneously and the chimeric antibody is harvested from the blood of the host. The cells may be used
in the same manner as hybridoma cells. See Diamond et al., N. Eng. J. Med. 304:1344 (1981), and Kennett,
McKearn and Bechtol (Eds.), Monoclonal Antibodies: Hybridomas: A New Dimension in Biologic Analysis,
Plenum, 1980.

The fusion proteins and immunoglobulin-like molecules of the invention may be isolated and purified in
accordance with conventional conditions, such as extraction, precipitation, chromatography, affinity
chromatography, electrophoresis or the like. For example, the IgG1 fusion proteins may be purified by
passing a solution through a column which contains immobilized protein A or protein G which selectively
binds the Fc portion of the fusion protein. See, for example, Reis, K.J., et al., J. Immunol. 132:3098-3102
(1984); PCT Application, Publication No. WO87/00329. The chimeric antibody may then be eluted by
treatment with a chaotropic salt or by elution with aqueous acetic acid (1 M).

Alternatively, the fusion proteins may be purified on anti-CD4 antibody columns, or on
anti-immunoglobulin antibody columns.

In one embodiment of the invention, cDNA sequences which encode CD4, or a fragment thereof which
binds gp120, may be ligated into an expression plasmid which codes for an antibody wherein the variable
region of the gene has been deleted. Methods for the preparation of genes which encode the heavy or light
chain constant regions of immunoglobulins are taught, for example, by Robinson, R. et al., PCT Application,
Publication No. WO87/02671.

Preferred immunoglobulin-like molecules which contain CD4, or fragments thereof, contain the constant
region of an IgM, IgG1 or IgG3 antibody which binds complement at the Fc region.

The fusion protein and immunoglobulin-like molecules of the invention may be used for the treatment of
HIV viral infections. The fusion protein complexes to gp120 which is expressed on infected cells. Although
the inventor is not bound by a particular theory, it appears that the Fc portion of the hybrid fusion protein
may bind with complement, which mediates destruction of the cell. In this manner, infected cells are
destroyed so that additional viral particle production is stopped.

For the purpose of treating HIV infections, the fusion protein or immunoglobulin-like molecule of the
invention may additionally contain a radiolabel or therapeutic agent which enhances destruction of the HIV
particle or HIV-infected cell.

Examples of radioisotopes which can be bound to the fusion protein or immunoglobulin-like molecule of
the invention for use in HIV therapy are I, Y, Cu, Bi, 211At, 212Pb, 47Sc, and Pd. Optionally,
a label such as boron can be used which emits α and β particles upon bombardment with neutron radiation.

For in vivo diagnosis, radionuclides may be bound to the fusion protein or immunoglobulin-like
molecule of the invention either directly or by using an intermediary functional group. An intermediary group
which is often used to bind radioisotopes, which exist as metallic cations, to antibodies is
diethylenetriaminepentaacetic acid (DTPA). Typical examples of metallic cations which are bound in this
manner are 99mTc, 123I, 111In, 131I, 97Ru, 67Cu, 67Ga, and 68Ga.

Moreover, the fusion protein and immunoglobulin-like molecule of the invention may be tagged with an
NMR imaging agent which includes paramagnetic atoms. The use of an NMR imaging agent allows the in
vivo diagnosis of the presence of and the extent of HIV infection within a patient using NMR techniques.
Elements which are particularly useful in this manner are 157Gd, 55Mn, 162Dy, 52Cr, and 56Fe.

Therapeutic agents may include, for example, bacterial toxins such as diphtheria toxin, or ricin. Methods
for producing fusion proteins comprising fragment A of diphtheria toxin are taught in U.S. Patent 4,675,382
(1987). Diphtheria toxin contains two polypeptide chains. The B chain binds the toxin to a receptor on a cell
surface. The A chain actually enters the cytoplasm and inhibits protein synthesis by inactivating elongation
factor 2, the factor that translocates ribosomes along mRNA concomitant with hydrolysis of GTP. See
Darnell, J., et al., in Molecular Cell Biology, Scientific American Books, Inc., page 662 (1986). Alternatively,
a fusion protein comprising ricin, a toxic lectin, may be prepared.

Introduction of the chimeric molecules by gene therapy may also be contemplated, for example, using
retroviruses or other means to introduce the genetic material encoding the fusion proteins into suitable
target tissues. In this embodiment, the target tissues having the cloned genes of the invention may then
produce the fusion protein in vivo.

The dose ranges for the administration of the fusion protein or immunoglobulin-like molecule of the
invention are those which are large enough to produce the desired effect whereby the symptoms of HIV or
SIV infection are ameliorated. The dosage should not be so large as to cause adverse side effects, such as
unwanted cross-reactions, anaphylactic reactions, and the like. Generally, the dosage will vary with the age,
condition, sex and extent of disease in the patient, contraindications, if any, immune tolerance and other
such variables, to be adjusted by the individual physician. Dosage can vary from 0.01 mg/kg to 50 mg/kg,
preferably 0.1 mg/kg to 1.0 mg/kg, of the immunoglobulin-like molecule in one or more administrations
daily, for one or several days. The immunoglobulin-like molecule can be administered parenterally by
injection or by gradual perfusion over time. It can be administered intravenously, intraperitoneally,
intramuscularly, or subcutaneously.
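
The per-kilogram ranges stated above translate into a total amount per administration by simple multiplication with body weight. The following is a minimal arithmetic sketch only; the function name `dose_range_mg` and the 70 kg example weight are assumptions introduced for illustration and do not appear in the specification.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg=0.01, high_mg_per_kg=50.0):
    """Return (lowest, highest) total amount in mg for one administration.

    The defaults are the broad range quoted in the text (0.01 to 50 mg/kg).
    The function name and signature are hypothetical.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


if __name__ == "__main__":
    weight_kg = 70.0  # assumed example weight, not taken from the text
    broad = dose_range_mg(weight_kg)                # 0.01 to 50 mg/kg
    preferred = dose_range_mg(weight_kg, 0.1, 1.0)  # 0.1 to 1.0 mg/kg
    print("broad range (mg): %.2f to %.2f" % broad)
    print("preferred range (mg): %.2f to %.2f" % preferred)
```

The preferred range is obtained by passing the narrower bounds explicitly, so both ranges from the text use the same calculation.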

Preparations for parenteral administration include sterile aqueous or non-aqueous solutions,
suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol,
vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include
water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media.
Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated
Ringer's, or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers,
such as those based on Ringer's dextrose, and the like. Preservatives and other additives may also be
present, such as, for example, antimicrobials, antioxidants, chelating agents, inert gases and the like. See,
generally, Remington's Pharmaceutical Sciences, 16th Ed., Mack Eds., 1980.

The invention also relates to a method for preparing a medicament or pharmaceutical composition
comprising the components of the invention, the medicament being used for therapy of HIV or SIV infection
in animals.

The detection and quantitation of antigenic substances and biological samples frequently utilize
immunoassay techniques. These techniques are based upon the formation of a complex between the
antigenic substance, e.g., gp120, being assayed and an antibody or antibodies in which one or the other
member of the complex may be detectably labeled. In the present invention, the immunoglobulin-like
molecule or fusion protein may be labeled with any conventional label.

Thus, the hybrid fusion protein or immunoglobulin-like molecule of the invention can also be used in an
assay for HIV or SIV viral infection in a biological sample by contacting a sample, derived from an animal
suspected of having an HIV or SIV infection, with the fusion protein or immunoglobulin-like molecule of the
invention, and detecting whether a complex with gp120, either alone or on the surface of an HIV-infected
cell, has formed.

For example, a biological sample may be treated with nitrocellulose, or other solid support which is
capable of immobilizing cells, cell particles or soluble protein. The support may then be washed with
suitable buffers followed by treatment with the fusion protein which may be detectably labeled. The solid
phase support may then be washed with the buffer a second time to remove unbound fusion protein and
the label on the fusion protein detected.

In carrying out the assay of the present invention on a sample containing gp120, the process
comprises:

a) contacting a sample suspected of containing gp120 with a solid support to effect immobilization of
gp120, or cell which expresses gp120 on its surface;
b) contacting said solid support with the detectably labeled immunoglobulin-like molecule or fusion
protein of the invention;
c) incubating said detectably labeled immunoglobulin-like molecule with said support for a sufficient
amount of time to allow the immunoglobulin-like molecule or fusion protein to bind to the immobilized
gp120 or cell which expresses gp120 on its surface;
d) separating the solid phase support from the incubation mixture obtained in step c); and

e) detecting the bound immunoglobulin-like molecule or fusion protein and thereby detecting and
quantifying gp120.

Alternatively, labeled immunoglobulin-like molecule (or fusion protein)-gp120 complex in a sample may
be separated from a reaction mixture by contacting the complex with an immobilized antibody or protein
which is specific for an immunoglobulin, e.g., protein A, protein G, anti-IgM or anti-IgG antibodies. Such
anti-immunoglobulin antibodies may be monoclonal or polyclonal. The solid support may then be washed
with suitable buffers to give an immobilized gp120-labeled immunoglobulin-like molecule-antibody complex.
The label on the fusion protein may then be detected to give a measure of endogenous gp120 and,
thereby, the presence of HIV.

This aspect of the invention relates to a method for detecting HIV or SIV viral infection in a sample
comprising

(a) contacting a sample suspected of containing gp120 with a fusion protein or immunoglobulin-like
molecule comprising CD4, or fragment thereof which binds to gp120, and the Fc portion of an
immunoglobulin chain; and
(b) detecting whether a complex is formed.

The invention also relates to a method of detecting gp120 in a sample, further comprising

(c) contacting the mixture obtained in step (a) with an Fc binding molecule, such as an antibody,
protein A, or protein G, which is immobilized on a solid phase support and is specific for the hybrid fusion
protein, to give a gp120 fusion protein-immobilized antibody complex,
(d) washing the solid phase support obtained in step (c) to remove unbound fusion protein, and
(e) detecting the label on the hybrid fusion protein.

Of course, the specific concentrations of detectably labeled immunoglobulin-like molecule (or fusion
protein) and gp120, the temperature and time of incubation, as well as other assay conditions may be
varied, depending on various factors including the concentration of gp120 in the sample, the nature of the
sample, and the like. Those skilled in the art will be able to determine operative and optimal assay
conditions for each determination by employing routine experimentation.

Other such steps as washing, stirring, shaking, filtering and the like may be added to the assays as is
customary or necessary for the particular situation.

One of the ways in which the immunoglobulin-like molecule or fusion protein of the present invention
can be detectably labeled is by linking the same to an enzyme. This enzyme, in turn, when later exposed to
its substrate, will react with the substrate in such a manner as to produce a chemical moiety which can be
detected as, for example, by spectrophotometric, fluorometric or by visual means. Enzymes which can be
used to detectably label the immunoglobulin-like molecule or fusion protein of the present invention include,
but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-V-steroid isomerase, yeast
alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish
peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease,
catalase, glucose-VI-phosphate dehydrogenase, glucoamylase and acetylcholine esterase.

The immunoglobulin-like molecule or fusion protein of the present invention may also be labeled with a
radioactive isotope which can be determined by such means as the use of a gamma counter or a
scintillation counter or by autoradiography. Isotopes which are particularly useful for the purpose of the
present invention are: ³H, ¹²⁵I, ¹³¹I, ³²P, ³⁵S, ¹⁴C, ⁵¹Cr, ³⁶Cl, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe and ⁷⁵Se.

It is also possible to label the immunoglobulin-like molecule or fusion protein with a fluorescent
compound. When the fluorescently labeled immunoglobulin-like molecule is exposed to light of the proper
wavelength, its presence can then be detected due to the fluorescence of the dye. Among the most
commonly used fluorescent labelling compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin,
phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

The immunoglobulin-like molecule or fusion protein of the invention can also be detectably labeled
using fluorescence emitting metals such as ¹⁵²Eu, or others of the lanthanide series. These metals can be
attached to the immunoglobulin-like molecule or fusion protein using such metal chelating groups as
diethylenetriaminepentaacetic acid (DTPA) or ethylenediaminetetraacetic acid (EDTA).

The immunoglobulin-like molecule or fusion protein of the present invention also can be detectably
labeled by coupling it to a chemiluminescent compound. The presence of the chemiluminescent-tagged
immunoglobulin-like molecule or fusion protein is then determined by detecting the presence of
luminescence that arises during the course of a chemical reaction. Examples of particularly useful
chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium ester, imidazole,
acridinium salt and oxalate ester.

Likewise, a bioluminescent compound may be used to label the immunoglobulin-like molecule or fusion
protein of the present invention. Bioluminescence is a type of chemiluminescence found in biological
systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The
presence of a bioluminescent protein is determined by detecting the presence of luminescence. Important
bioluminescent compounds for purposes of labeling are luciferin, luciferase and aequorin.

Detection of the immunoglobulin-like molecule or fusion protein may be accomplished by a scintillation
counter, for example, if the detectable label is a radioactive gamma emitter, or by a fluorometer, for
example, if the label is a fluorescent material. In the case of an enzyme label, the detection can be
accomplished by colorimetric methods which employ a substrate for the enzyme. Detection may also be
accomplished by visual comparison of the extent of enzymatic reaction of a substrate in comparison with
similarly prepared standards.

The assay of the present invention is ideally suited for the preparation of a kit. Such a kit may comprise
a carrier means being compartmentalized to receive in close confinement therewith one or more container
means such as vials, tubes and the like, each of said container means comprising the separate elements of
the immunoassay. For example, there may be a container means containing a solid phase support, and
further container means containing the detectably labeled immunoglobulin-like molecule or fusion protein in
solution. Further container means may contain standard solutions comprising serial dilutions of analytes
such as gp120 or fragments thereof to be detected. The standard solutions of these analytes may be used
to prepare a standard curve with the concentration of gp120 plotted on the abscissa and the detection
signal on the ordinate. The results obtained from a sample containing gp120 may be interpolated from such
a plot to give the concentration of gp120.
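
The interpolation from a standard curve described above can be sketched as follows. This is a minimal illustration, assuming piecewise-linear interpolation between neighbouring standards and a signal that rises strictly with concentration; the function name, the units and every numerical value are hypothetical and are not data from the specification.

```python
from bisect import bisect_left


def interpolate_concentration(standards, signal):
    """Estimate analyte concentration from a detection signal.

    standards: iterable of (concentration, signal) pairs measured on serial
               dilutions; the signal is assumed to increase strictly with
               concentration.
    signal:    detection signal measured for the unknown sample.
    """
    pts = sorted(standards)  # order by concentration (abscissa)
    for (_, s0), (_, s1) in zip(pts, pts[1:]):
        if s1 <= s0:
            raise ValueError("signal must increase strictly with concentration")
    signals = [s for _, s in pts]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range of the standards")
    i = bisect_left(signals, signal)  # first standard with signal >= sample
    if signals[i] == signal:
        return pts[i][0]
    (c0, s0), (c1, s1) = pts[i - 1], pts[i]
    # Linear interpolation between the two bracketing standards.
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)


if __name__ == "__main__":
    # Hypothetical two-fold dilution series: (concentration, signal).
    curve = [(0.0, 0.05), (12.5, 0.20), (25.0, 0.38), (50.0, 0.70), (100.0, 1.20)]
    print(interpolate_concentration(curve, 0.54))
```

Samples whose signal falls outside the standards are rejected rather than extrapolated, since the curve says nothing about that region; in practice such a sample would be diluted or the standard series extended.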

The immunoglobulin-like molecule or fusion protein of the present invention can also be used as a stain
for tissue sections. For example, a labeled immunoglobulin-like molecule comprising CD4 or fragment
thereof which binds to gp120 may be contacted with a tissue section, e.g., a brain biopsy specimen. This
section may then be washed and the label detected.

The following examples are illustrative, but not limiting, of the method and composition of the present
invention. Other suitable modifications and adaptations which are obvious to those skilled in the art are within
the spirit and scope of this invention.



EXAMPLES 

Example 1: Preparation of CD4-Ig cDNA Constructs

The extracellular portion of the CD4 molecule (See Maddon, P.J., et al., Cell 42:93-104 (1985)) was
fused at three locations in a human IgG1 heavy chain constant region gene by means of a synthetic splice
donor linker molecule. To exploit the splice donor linker, a BamHI linker having the sequence
CGCGGATCCGCG was first inserted at amino acid residue 395 of the CD4 precursor sequence (nucleotide
residue 1295). A synthetic splice donor sequence

    GATCCCGAGGGTGAGTACTA
        GGCTCCCACTCATGATTCGA

bounded by BamHI and HindIII complementary ends was created and fused to the HindIII site in the intron
preceding the CH1 domain, to the EspI site in the intron preceding the hinge domain, and to the BanI site
preceding the CH2 domain of the IgG1 genomic sequence. Assembly of the chimeric genes by ligation at
the BamHI site afforded molecules in which either the variable (V) region, the V + CH1 regions, or the V,
CH1 and hinge regions were replaced by CD4. In the last case, the chimeric molecule is expected to form a
monomer structure, while in the former, a dimeric molecule is expected.
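
The properties attributed to the two synthetic oligonucleotides can be checked mechanically. The sketch below assumes that the BamHI linker reads CGCGGATCCGCG, that the upper strand of the splice donor reads GATCCCGAGGGTGAGTACTA (5' to 3'), and that the lower strand reads GGCTCCCACTCATGATTCGA (printed 3' to 5'); the function and variable names are introduced here for illustration only.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, keeping the printed order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand read 5' to 3'."""
    return complement(seq)[::-1]


BAMHI_LINKER = "CGCGGATCCGCG"
TOP = "GATCCCGAGGGTGAGTACTA"     # assumed 5' -> 3'
BOTTOM = "GGCTCCCACTCATGATTCGA"  # assumed printed 3' -> 5'


def check_oligos():
    """Return named True/False checks on the linker and the splice donor duplex."""
    return {
        # A self-complementary linker anneals to itself to form a duplex.
        "linker_is_palindromic": reverse_complement(BAMHI_LINKER) == BAMHI_LINKER,
        "linker_has_bamhi_site": "GGATCC" in BAMHI_LINKER,
        # The first four bases of the top strand are unpaired: a GATC
        # overhang, compatible with a BamHI-cut end.
        "bamhi_overhang": TOP[:4] == "GATC",
        # The remaining 16 bases pair with the first 16 of the bottom strand.
        "duplex_is_complementary": complement(TOP[4:]) == BOTTOM[:16],
        # The last four bases of the bottom strand, read 5' -> 3', give AGCT,
        # compatible with a HindIII-cut end.
        "hindiii_overhang": BOTTOM[-4:][::-1] == "AGCT",
        # GTGAGT fits the common GTRAGT splice donor consensus (R = A or G).
        "splice_donor_motif": "GTGAGT" in TOP,
    }


if __name__ == "__main__":
    for name, ok in check_oligos().items():
        print(name, ok)
```

Under these assumed readings all six checks hold, which is consistent with the description of a duplex "bounded by BamHI and HindIII complementary ends".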

One such genetic construct which contains the DNA sequence which encodes CD4 linked to human IgG1
at the Hind3 site upstream of the CH1 region (fusion protein CD4Hγ1) is depicted in Table 1. The plasmid
containing this genetic construct (pCD4Hγ1) has been deposited in E. coli (MC1061/P3) at the American
Type Culture Collection (ATCC) under the terms of the Budapest Treaty and given accession number
67611.

A second genetic construct which contains the DNA sequence which encodes CD4 linked to human
IgG1 at the Esp site upstream of the hinge region (fusion protein CD4Eγ1) is depicted in Table 2. The
plasmid containing this genetic construct (pCD4Eγ1) has been deposited in E. coli (MC1061/P3) at the
ATCC under the terms of the Budapest Treaty and given accession number 67610.

A third genetic construct which contains the DNA sequence which encodes CD4 linked to human IgM at
the Mst2 site upstream of the CH1 region (fusion protein CD4Mμ) is depicted in Table 3. The plasmid
containing this genetic construct (pCD4Mμ) has been deposited in E. coli (MC1061/P3) at the ATCC under
the terms of the Budapest Treaty and given accession number 67608.

A fourth genetic construct which contains the DNA sequence which encodes CD4 linked to human IgM
at the Pst site upstream of the CH2 region (fusion protein CD4Pμ) is depicted in Table 4. The plasmid
containing this genetic construct (pCD4Pμ) has been deposited in E. coli (MC1061/P3) at the ATCC under
the terms of the Budapest Treaty and given accession number 67609.

A fifth genetic construct which contains the DNA sequence which encodes CD4 linked to human IgG1 at
the Ban 1 site downstream from the hinge region (fusion protein CD4Bγ1) is depicted in Table 5.

Two similar constructs were prepared from the human IgM heavy chain constant region by fusion with
the introns upstream of the μ CH1 and CH2 domains at an MstII site and a PstI site respectively. The
fusions were made by joining the PstI site of the CD4/IgG1 construct fused at the Esp site in the IgG1 gene to
the MstII and Pst sites in the IgM gene. In the first instance, this was performed by treatment of the Pst end
with T4 DNA polymerase and the MstII end with E. coli DNA polymerase, followed by ligation; and in the
second instance, by ligation alone.

Immunoprecipitation of the fusion proteins with a panel of monoclonal antibodies directed against CD4
epitopes showed that all of the epitopes were preserved. A specific high affinity association is demonstrated
between the chimeric molecules and HIV envelope proteins expressed on the surface of cells transfected
with an attenuated (reverse transcriptase deleted) proviral construct.
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VPFRHLLLVLQLALLPAATQ- 

B E E R A 
B C C S L 
V 0 0 A U 
IKK 11 
AGGGAAAGAAAGTGGTGCTCGGCAAAAAAGGCGATACAGTGCAAaGACaGTACACCn 
181 ♦ ♦ ♦ 240 

TCCCTTTCTTTCACCACGACCCCTTTFTTCCCCTATGTCACCnGACTGGACATGTCGAA 
CKKVVLGKKGDTVELTCTAS- 
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ATCACGGCTCCTTeTTAACTAAAGCTCCATCCAACCTGAATGATCGCGCTGACTCAAGAA 

301 ♦ ♦ ♦ ♦ ♦ >♦ 360 

TAGTCCCCACGAAGAATrCATnCCAGCTAGCTTCCACTTACTACCGCCACTCAGnCTT 

OGSFLTKGPSKLNDRAOSRR- 
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GAACCCITTCGGACCAAGGAAACnCCCCCTGATCATCAAGAATCnAAGATAGAAGACT 
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CnCGGAAACCCTGGnCCTTTGAAGGGGGACTAGTACTTCnAGAAnCTATCTTCTGA 
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CAGATACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGAGGTCCAAnGCTACTGnCG 
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GTCTATGAATGTAGACACnCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC 

DTYICEVEOQKEEVQLLVFC- 
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481 ♦ ♦ ♦ ♦ ♦ ♦ 540 
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AGGGGGCCAAGACCCTCTCCGTGTCTCAGCTGGAGCTCCAGGATAGTGGCACCTGGACAT 
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GCACTGTCTTGCAGAACCAGAAGAACGTGGAGTTCAAAATAGACATCGTGGTGCTAGCTT 
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CGTGACAGAACGTCT7GGTCT7CTTCCACCTCAAGTTTTATCTGTAGCACCACGATCGAA 
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TCCAGAAGGCCTCCAGCATACTCTATAAGAAAGAGCCGGAACACGTCCAGTTCTCCTTCC 
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CACTCCCCnTACACTTGAAAAGCTGACGGGCAGTGGCGAGCTGTGCTCGCAGGCGCAGA 
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AACGGGnACCCAGCACCCTAAGCTCCAGATGGGCAAGAAGCTCCCGCTCCACCTCACCC 
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rrGCCCAATGGGTCCTGGGAnCGAGGTCTACCCGTTCnCGAGGGCCAGGTGGAGTGGG 

RVTUDPKLdMGKKLPLHLTL- 
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TCCCGGACAGnAAGAACCCAGGGGCCTCTGCGCCTGGGCCCAGCTCTGTCCCACACCGC 
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AGCGCCTGTCAAnCTTGGGTCCCCGGAGACGCGCACCCGGGTCGAGACAGGGTGTGGCC 
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ACnCCCCGAACCGGTGACGGTGTCGTGGAACTCAGGCGCCCTGACCAGCGGCGTGCACA 

1621 4 ♦ 4 4 * ♦ 16S0 

TGAAGGGGCnCGCCACTGCCACAGCACCTTCAGTCCGCGGGACTGGTCGCCCCACGTGT 

H T - 



P E P V T 


V S 


W 


N S 


G A 


L T 


S G 


V 


S 




H 
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B 




HNC 


DM 


I 


M 


D 


N 


M SM 


6 


PCR 


OS 


N 


N 


D 


U 


N TA 


B 


AIF 


ET 


F 


L 


E 


4 


L EE 


V 


211 


12 


1 


1 


1 


H 


1 23 
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II 


/ 










/ 





CCrrCCCGGCTGTCCTACAGTCCTCAGGACTCTACTCCCTCAGCACCGTGGTCACCGTGC 

1681 4 4 4 4 1740 

GGAAGGGCCGACAGGATGTCAGGAGTCCTGAGATGAGGGAGTCGTCGCACCACTGGCACG 
FPAVLQSSGLYSLSSVVTVP- 

B F B B H 
SH N ASM B NSB MI 
PP U LTN A LPB A N 

IH 4 UXL N AlV E F 

21 H 111 1 421 2 1 

/ 

CCTCCAGCAGCnGGGCACCCAGACCTACATCTGCAACGTGAATCACAAGCCCAGCAACA 

1741 4 ♦ ♦ ♦ ♦ ♦ 1800 

GGAGCTCGTCGAACCCGTGCGTCTGGATGTAGACGTTGCACTTAGTGnCCGGTCGTTGT 
SSSLGTHTYICNVNHKPSNT- 
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5 M HM HM 

T N AN PN 

Y L EL HL 

1 1 31 U 

CCAAGGTGGACAAGAAAGTTCCTGACACCCCAGCACAGGGAGCGAGCGTCTCTGaGGAA 

« ^ ^^•—•^•••••••••^ ^ ***** ***** iOOV 

SiTTCCACCTCTTCTnCAACCACTCTCCGGTCCTGTCCCTCCCTCCCACACACCACCTT 

K V D K K V 

E BS SS F BS F . 

DE CHH F SC HHNCF N BSC N 

DS OHA 0 TR PGCRA U BTR U 

EP 4AE K NF AAIFN 4 VNF 4 

11 712 1 n 21111 H 111 H 

CCACGCTCAGCGCTCaCCCTCCACGCATCCCCGCTATGCAGCCCCACTCCAGCCCACCA 

Ififil •• — ••—♦ — -•—-♦•*••—**♦••*••••••**'■**************"*** 

CGTCCGAGTCCCCAGCACGGACCTGCGTAGGGCCGATACGTCGGGGTCAGGTCCCGTCGT 

s s 

OBHMHNA HMNCN M MNOM 

RBABPLU PNCRL N NLDB 

AVE0HA9 ALIFA L LAEO 

2132146 21114 I 1312 

AGGCAGGCCCCCTCTCCCTCnCACCCGGAGCCTCTCCCCGCCCCACTCATGCTCACCGA 

,QOi ♦ « ♦ ♦ ♦ ♦ 1980 

TCCGTCCGGGGCAGACGGAGAAGTGGGCCTCGGAGACGGGCGGGGTGACTACGAGTCCCT 

BS P B BS 

SC F M 6 N S SC 

TR L A A L P TR 

NF M E N A 1 NF 

11 1 114 2 11 

/ / 
GAGCGTCTTCTGGCTTTnCCCAGGCTCTGGGCAGCCACAGCCTACCTGCCCCTAACCCA 

1Q01 * ♦ ♦ - ♦ 2040 

CTCCCAGAACACCGAAAAAGGGTCCGAGACCCCTCCGTCTCCGATCCACGGGGATTGGGT 

SB B B S 

DHA S DBS S M HNC A 

RAU P DAP P N PCR V 

AE9 M ENl M L AIF A 

236 1 122 1 1 211 2 

CGCCCTGCACACAAAGCGGCAGGTGCTGGGCTCAGACCTOCCAAGAGCCATATCCCGGAG 

2041 « ♦ ♦ * ♦ ♦ 2100 

CCGGGACGTGTGTTTCCCCGTCCACGACCCCAGTCTGGACCGTTCTCCGTATAGGCCCTC 
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PS 

DNPA D H DAM 

RLUU D A 0 L N 

AAM9 E E E U L 

2416 1 3 111 

CACCCTCCCCCTCACCTAAGCCCACCCCAAAGGCCAAACTCTCCACTCCCTCACCTCGCA 

2101 ♦ ♦ ♦ — ♦ ♦ ♦ 2160 

CTGGGACGCCGACTOGATTCCGCTGGGGTrrCCCGTTTGAGAGGTCAGGGAGTCGACCa 

H B 

I M MM P BS . 

N N AS S AP 

F L EO T Ml 

1 1 32 1 22 

CACCTTCTCTCCTCCCAGATTCCAGTAACTCCCAATCTTCTCTCTGCAGAGCCCAAATCT 

2161 ♦ ♦ ♦ ♦ -.♦ 2220 

CTGGAAGAGAGGAGGGTCTAAGGTCATTGAGGGnAGAAGAGAGACGTCTCGGGTTTACA 

E P K S - 

N BBS BS 

M NS SSC SC HS M 

A LP PTR TR AT N 

E AH INP MP EU L 

3 31 211 11 31 1 

/ til 
TGTGACAAAACTCACACATCCCCACCGTCCCCAGGTAAGCCACCCCAGGCCTCGCCCTCC 

2221 — ♦ ♦ ♦ ♦ ♦ — ♦ 2280 

ACACTGTTTTGACTGTGTACCGCTGGCACCGGTCCATTCCGTCGGCTCCCGAGCGGGACC 
CDKTHTCPPCP 

B 

AM B N SM P 

L N A L PA 0 

U L N A IE K 

11 1 4 21 1 

AGCTCAACGCGGGACAGGTGCCCTACAGTAGCCTGCATCCAGGGACAGCCCCCACCCGGG 

2281 ♦ ♦ ♦ — * * * 2340 

TCGAGTTCCGCCCTGTCCACGGGATCTCATCGGACGTAGGTCCCTGTCCGGGGTCGGCCC 

BS S 

AM M M D M SC MANAM 

p A B NO N TR B VLU B 

LEO LE L NP0AA9 0 

3 2 2 1 1 1 11 2 246 2 

TGCTCACACGTCCACCTCCATCTCnCCTCAGCACCTGAACTCCTGGGGGGACCGTCAGT 

2341 ♦ ♦ ♦ ♦ ♦ ♦ 2400 

ACGACTGTGCAGGTGGAGGTAGAGAAGGAGTCGTGGACnGAGGACCCCCCTGGCAGTCA 

APELLGCPSV- 
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SC 
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DHNA 
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RALU 


PCR 
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AEA9 


AlP 


11 
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2346 


211 
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/ 
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s ss 

MS AN y HMANNAC DM M 

NT UL N PNVCLUR DS A 

L Y 3A L ALAIA9F ET E 

11 A3 1 2121461 12 3 

/ / // / 

CrrCCTCnCCCCCCAAAACCCAAGGACACCCTCATGATCTCCCCGACCCCTGAGGTCAC 

2401 ♦ ♦ ♦ ♦ ♦ ♦ 2460 

CAAGGACAACGGGGGmTGGGnCCTGTGCGACTACTAGACCCCaCGGGACTCCAGTG 

FLFPPKPKDTLMISRTPEVT- 
N 

NS M M DM M RM M 

LP A N DS B SA N 

AH E L ET 0 AE L 

31 2 1 12 2 12 1 

ATCCGTGGTGCTGGACGTGAGCCACGAAGACCCTGAGGTCAAGnCAACTCGTACCTGGA 

2461 ♦ ♦ ♦ — ♦ ♦ 2520 

TACGCACCACCACCTGCACTCGGTGCnCTGCGACTCCAGTTCAAGnGACCATCCACCT 

CVVVDVSHEOPEVKFNWYVD- 

F FN 

M N NSS R M R 

N U UFA S .AS 

L 4 DBC A E A 

1 H 222 1 2 1 

// 

CGGCGTGCAGCTGCATAATGCCAACACAAAGCCCCCCCAGGAGCAGTACAACAGCACGTA 

2521 * " - * 2580 

GCCCCACCTCCACGTAnACGGTTCTGTTTCGGCGCCCTCCTCGTCATGTTGTCGTGCAT 

GVEVHNAKTKPREEQYNSTY- 

S BS 

HNC HH M SC R 

. PCR GP N TR S 

AIF AH L NF A 

211 11 1 11 1 

CCGGGTGGTCAGCGTCaCACCCTCCTCCACCACGACTGGCTGAATGGCAAGGAGTACAA 
GGCCCACCAGTCGCAGGAGTGGCAGGACGTGCTCCTGACCGACTTACCGTTCCTCATCTT 
RVVSVLTVLHQDWLNGKEYK- 

M T 
N A 

L q 
1 1 

GTGCAAGGTCTCCAACAAAGCCCTCCgACCCCCCATCGAGAAAACCATCTCCAAAGCCAA 

2541 .—4— — ♦ ♦ ♦ ♦ ♦ 2700 

CACGTTCCAGAGGTrGTTTCGGGAGCGTCGGCGCTAGCTCTTTTGGTAGAGGTnCGGTT 
CKVSNKALPAPIEKTISKAK- 



* 2640 
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P S S S 
ADNNPMA A H M N HHN BSAH 
VRLLUNU U A N L APA GFUA 
AAAAML9 9 E L A EAE LI9E 
2244116 6 3 1 3 321 1163 
//// / / 
AGGTGCGACCCGTGGGGTGCGACGGCCACATGCACAGAGGCCGGCTCGGCCCACCCTCTG 
2701 ♦ ♦ * ♦ ♦ ♦ 2760 

TCCACCCTGGGCACCCCACGCTCCCGGTGTACCTGTCTCCCGCCGAGCCCGGTGCGAGAC 
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CCCTGAGACTGACCGCTCTACCAACCTCTGTCCTACAGGGCAOCCCCGAGAACCACAGGT 
2761 — 4 ♦ — — ♦ ^— — ♦ 2820 

GGGACTCTCACTCGCGACATGGTTGGAGACAGGATGTCCCCTCGGGGCTCnCGTGTCCA 

GQPREPqv- 

SS BS BS 

R F AHNNCCS A F SC SC 

S 0 VPCCRRM L 0 TR TR 

A K AAIIFFA U K NF NF 

1 1 1211111 1 1 11 11 

///// / / 

GTACACCCTGCCCCCATCCCGGCATGAGCTGACCAAGAACCAGGTCAGCCTGACCTCCCT 

2821 ♦ ♦ ♦ ♦ ♦ 2880 

CATGTGGCACGGGGGTAGGGCCCTACTCGACTGGTrCTTGGTCCAGTCGGACTGGACGGA 

YTLPPSROELTKNQVSLTCL- 

B F 
S N H 

P UP 
M 4 A ' 

I H 2 

GGTCAAAGGCnCTATCCCAGCGACATCGCCGTGCAGTGGGACAGCAATGCGCAGCCGCA 
2881 4 ♦ ♦ ♦ ♦ ♦ 2940 

CCAGTrTCCGAAGATAGGGTCCCTCTAGCGGCACCTCACCCTCTCGTTACCCGTCOGCCT 
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E W 
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1 1 
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GAACAACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGGCTCCnCTTCCTCTACAG 

2941 ♦ ♦ 4 4 ♦ ♦ 3000 

CnGTTGATGTTCTGGTGCGGAGGCCACGACCTGAGGCTGCCGAGGAAGAAGGAGATGTC 

NNYKTtPPVLOSOGSFFLYS- 
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M A 

N L 
L U 
1 1 



B 
S 
P 
M 
1 



F 

NM 
UB 
40 
H2 



MBX 
ABM 
EVN 
211 



NF M 

LA N 

AN L 

31 1 
/ 



S 



CAAGCTCACCGTGGACAACAGCACGTGGCAGCAGGGGAACGTCnCTCATCaCCCTGAT 

♦ * ♦ ♦ * ♦ 3 

GTTCGAGTOGCACCTCnCTCGTCCACCGTCGTCCCCTTCCAGAACAGTACGAGCCAaA 

KLTVOKSRWqilGNVFSCSVM- 



GCATGAGGCTCTGCACAACCACTACACCCAGAAGAGCCTCTCCCTCTCTCCCCGTAAATG 
3061 4 4 ♦ ♦ ^ 

CCTACTCCGACACCTGnCGTGATCTGCGTCnCTCCGACAGCGACAGAGGCCCATnAC 
HEALHNHYTQKSLSL SPGK. 



CXHHN 
FMAPA 
RAEAE 
13321 
/ 

ACTGCGACGGCCG 

3121 — 3133 

TCACGCTCCCGGC 



S 



N N 
S L 
I A 
1 3 



M M HNC 

B N PCR 

0 L AZF 

2 1 211 
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FN S B 

N S B M H OHA S 

UP B N G RAU T 

4 8 V L A AE9 X 

H 2 1 1 1 236 1 

OCCTGTTTGAGAAGCAGCGCCCAAGAAAGACGCAAGCCCAGAGCCCCTGCCATTTCTOTG 

1 ♦ ■» ♦ ♦ ♦ ♦ 60 

CGGACAAACTCTrCGTCGCCCGnCTrrCTGCGnCGGGTCTCCGCGACCGTAAAGACAC 

B PS S S 

DBS ADNPA D DHNA M HM HNC 

DAP VRLUU D RALU N AN PCR 

ENl AAAM9 E AEA9 L EL AIF 

122 22416 1 2346 1 31 211 

/ / // / / / 
GGCTCAGGTCCCTACTGCCTCAGGCCCCTGCCTCCCTCGGCAAGGCCACAATGAACCGGG 
61 ♦ ♦ 4 ♦ ♦ ♦ 120 

CCGAGTCCAGGGATGACCGAGTCCGGGGACGGAGGGAGCCGnCCCGTGnACTTGGCCC 
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12 


H 1 
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GAGTCCCTTTrAGGCACnCCTTCTGGTGCTGCAACTGGCGCTCCTCCCAGCAGCCACTC 

121 ♦ ♦ 180 

CTCAGGGAAAATCCGTGAACGAAGACCACGACGnGACCGCGAGGAGGGTCGTCGGTGAG 

VPFRHLL LVLQL ALLPAATQ- 

B E E R A 

B C C S L 

V 0 0 A U 

1 K K 1 1 

AGGGAAAGAAAGTGGTGC7GGGCAAAAAAGGGGATACACTGGAACTGACCTGTACAGCTT 
181 * — ♦ 4 24C 

TCCCTTrCTnCACCACGACCCGTTTrTTCCCCTATGTCACCTTGACTGGACATGTCGAA 
GKKVVLGKKGDTVELTCTAS- 

H 

MM I 
B B N 
0 0 F 
2 2 1 

CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGAnCTGGCAA 
2^1 •» ♦ ♦ ♦ ♦ ♦ 300 

GGGTCnCTTCTCGTATGTTAAGGTCACCTTTrTGAGCnGCTCTATnCTAAGACCCTT 

QKKSIQFHWKNSNQIKILGN- 
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N H 
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U 


U H 


N 
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3 


D A 


F 


1 


A 


2 1 


1 



B S 

NBS F AA 

LAP 0 VU 

ANl K Ad 

422 1 26 

ATCAGGCCTCCTTCrrAACTAAAGGTCCATCCAACCTGAATCATCCCCCTCACTCAAGAA 

301 — — 4 * ♦ ♦ ♦ 360 

TAGTCCCGAGGAACAATTGATnCCAGCTACGTrCGACnACTACCCCGACTGACncn 
QGSFLTKGPSKLNORADSRR- 

S S H H 

MANAS BA I A ID 

BVLin CU N F NO 

0AA9Y L3 F L F E 

22461 lA 12 11 

CAAGCCTTTGGGACCAAGGAAACnCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

361 ♦ 4 ♦ ♦ ♦ 420 

' CnCGGAAACCCTGGTTCCTTTCAAGCCGGACTAGTAGnCTTAGAATTeTATCTTCTGM ~ - 
SLWOQGNFPLIIKNLKIEOS- 

S 

M M AMAM M 

B N VNUN A 

0 L AL9L E 

2 1 2161 1 

aGATACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGAGGTCCAATTGCTAGTGTTCG 

421 - - * 

CTC7ATGAATGTAGACACT7CACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC 

OTYlCEVEDQKEEVqLLVFG- 

B 

s s 

P T 
M Y 
1 I 

CATTCACTGCCAACTCTGACACCCACCTGCTTCACGGCCAGACCCTGACCCTGACCTTCG 
481 * 540 

CTAACTGACGCnGAGAClGTGGGTGGACGAAGTCCCCGTCTCGGAfTGGCACTGCAACC 

LTAN SDTHLLQGQSLTLTLE- 

B BS H 

BS SC D MIS 

AP TR D N N T 

Nl NF E L F Y 

22 11 1 111 

/ / 

ACACCCCCCCTGGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGGGCTAAAAACATAC 

541 4 ♦ — 4 4 — 4 4 600 

TCTCGGGGGGACCATCATCGGGGAGTCACGTTACATCCTCAGGnCCCCATTrTTGTATC 
SPPGSSPSVQCRSPRGKNIQ. 
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N BBH S B 6S 

M MO ASP A BSSGSC S B N SC 

8 ND LPV L APTIAR T A L TR 

0 LE U8U U NINACF X N A NF 

2 11 122 1 221111 1 1 4 11 

// //// / 
AGCGCGGGAACACCCTCTCCOTGTCTCACCTCGAGCTCCAGGATAGTGCCACCTGGACAT 

601 — 4 ♦ ♦ — ♦ * ♦ 660 

TCCCCCCCnCTGGCAGAGCCACAGAGTCGACCTCGACGTCCTATCACCGTGGACCTGTA 

CCKTLSVSQLELQOSCTWTC- 

N 

N5 M NM A 

LP B HA L 

AH 0 EE U 

31 2 11 I 

GCACTGTCTTCCAGAACCAGAAGAAGGTGGACn£AAAATAeACATCGTGGTGCTAC-CTX.. 

661 ♦ 4 4 4 4 4 720 

CGTGACAGAACGTCnCGTCncnCCACCTCAAGTTnATCTGTAGCACCACGATCGAA 
TVLQNQKKVEFKIOIVVLAF- 

HS MM 

AT N N 

EU L L 

31 11 

TCCAGAAGGCCTCCAGCATAGTCTATAAGAAAGAGGGGGAACAGGTGGAGnCTCCTTCC 

721 4 4 4 4 4 780 

AGGTCTTCCGGAGGTCCTATCACATATTCTTTCTCCCCCnGTCCACCTCAAGAGGAAGC 
QKASSIVYKKEGEQVEFSFP- 

A AM 

L L N 

U U L • 

1 11 

CACKGCCTTTACAGTTGAAAAGCTGACGGCCAGTGGCGAGCTGTGGTGGCAGGCGGAGA 
76: ♦ 4 ♦ ♦ B^C 

CTGASCGGAAATCTCAACnrrCGACTGCCCGTCACCGCTCGACACCACCGTCCGCCTCT 

LAFTVEKLTGSGELWWQAER- 
P S 

H M FV A M 
P N LN U B 
H L ML 3 0 
1 1 11 A 2 

GGGCTrCCTCCTCCAAGTCnGGATCACCrrTGACCTGAACAACAAGGAAGTGTCTGTAA 
841 — ♦ « ♦ ♦ ♦ ♦ 900 

CCCGAAGGAGGAGGnCACAACCTAGTCGAAACTGGACncncnCCnCACAGACAn 

ASSSKSWITFDLKNKEVS VK- 
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6 BS PS 

SM SCADNPAD A AH 

TA TRVRLUUD L LP 

EE NFAAAM9E U U H 

23 11224161 1 1 1 

AACGGGnACCCACGACCCTAAGCTCCAGATCCGCAAGAAGCTCCCGCTCCACCTCACCC 

901 ♦ ♦ ♦ ®60 

TTGCCCAATCGGTCCTCCGATrCGAGCTCTACCCGTTCTTCGAGGGCGACGTGGACTGCG 

RYTQDPKLqMGKKLPLHLTL- 

BS BSS 

M SC HS 0 M H SCAHM 

N TR AT 0 N P TRUAN 

L NF EU E L H NF9EL 

1 11 31 1 11 11631 

/ / / / 

TOCCCCAGGCCnCCCTCAGTATGCTGGCTCTCGAAACCTCACCCTGGCCCTTGAAGCGA 

961 4 «- ♦ ♦ ♦ — 1020 

ACGGGGTCCCGAACCGAtiTCATACCACCCAGACCTTTCGAGTSGGACCGCGAACTTCCCT 

PHALPQYACSCNLTLALEAK. 

S BS 

F SC H D A 

A TR POL 
N NF HE U 

1 11 1 1 1 

AAACAGCAAAGTTGCATCAGGAAGTGAACCTGGTGGTGATGAGAGCCACTCAGCTCCAGA 

1021 1080 

TTTGTCCTTTCAACGTAGTCCTTCACnCGACCACCACTACTCTCGGTGAGTCCAGGTCT 
TGK LHQ. EVNLVVMRATQLQK- 

PS S 

W ADNNPA OF AM OE A 

N VRlLUU OA LN OS L 

L AAAAM9 EN UL EP U 

] 224416 11 11 11 1 

AAAATTTGACCTGTCAGGTGTCGGGACCCACCTCCCCTAAGCTCATGCTGAGCnCAAAC 

1081 ♦ ♦ ♦ * * 

TTTTAAACTGCACACTCCACACCCCTGGGTGGAGGGGATTCCACTACGACTCCAACTnC 

NLTCEVWCPTSPKLMLSLKL- 

M T H M Oy 

N A P N OS 

1 Q A L ET 

1 1 2 1 12 

TGGAGAACAACGAGGCAAAGGTCTCGAAGCGGCAGAAGCCGG7GTGGGTCCTGAACCCTG 

1141 4 ♦ ♦ 1200 

ACCTCTTCnCCTCCGTnCCAGAGCnCGCCCTCnCGGCCACACCCACGACTTGGGAC 

ENKEAKVSK REKPVWVLNPE- 
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H PS H 

F 0 M I A ADPA I 

0 D A N V VRUU N 

K E E F A AAM9 F 

12 3 11 2216 1 

AGGCCCOGATGTCCCACTCTCTCCTCACTGACTCCCCACACGTCCTGCTCGAATCCAACA 
1201 ♦ ♦ 4 ♦ 4 ♦ 1260 

TCCGCCCCTACACCGTCACAGACGACTCACTGAGCCCTGTCCAGGACGACCnAGGTTCT 
AGMWQCLLSOSOqVLLESNI. 

S SA 6HF BS H 
ANA HNCP SGNMAANXA RSD I A 
VLU PCRA PIUNMULHV SCD N L 
AA9 AIFL 1AOLH3A0A AAE D U 
236 ^111 21211A421 1113 1 

TCAAGGTTCTGCCCACATCGTCCACCCCGGTGCACCCGGATCCCCAGGGTGACTACTAAC 
1261 — — — 4 * 4.0.......4 ♦ ^ 1320 

AGnCCAAGACGGGTCTACCACGTGGGGCCACCTGCGCCTAGGGCTCCCACTCATGAnC 

KVLPTWSTPVHAOPE 

E BS 55 F BS F 

H CHH F SC HHNCF -N BSC N 

P OHA 0 TR PGCRA U BTR U 

H 4AE K NF AAIFN 4 VNF 4 

I 712 1 11 21111 H 111 H 

CnCACCCCTCCTGCCTGGACGCATCCCCGCTATGCAOCCCCAGTCCAGGGCACCAAGGC 



1321 



GAAGTCGCCAGGACGGACC7GCGTAGCGCCGATACGTCGGGCTCACGTCCCGTCGTTCCG 



S 5 

DBHVHNft HVNCN M MNDM 

RBA5PLU PNCRL N NLDB 

AVE0HA9 A.IFA L LAEO 

2132146 21114 1 1312 

// // // 
AGGCCCCGTCTGCacnCACCCGGAGCCTCTCCCCGCCCCACTCATGCTCAGGGAGAGG 

1381 4 — — 4— — — 4 — — — 4-........4 1440 

TCCGCGGCAGACGGAGAAGTGGGCCTCGGAGACGGGCGGGGTGAGTACGAGTCCCTCTCC 

BS P B BS S

SC F M B N S SCDHA

TR L A A L P TRRAU

NF M E N A 1 NFAE9

11 1 114 2 11236

GTCnCTGGCTTTTTCCCAGGCTCTGGGCAGCCACAGGCTAGGTGCCCCTAACCCAGGCC 

1441 ♦ ♦ 1500

CAGAAGACCGAAAAAGGGTCCCAGACCCGTCCGTGTCCGATCCACGCGGAnGGGTCCGG 
PS

S DBS S M HNC ADNPA

P DAP P N PCR VRLUU

M EN1 M L AIF AAAM9

1 122 1 1 211 22416

/ / / //



CTCCACACAAAGCGGCAGGTGCTGGGCTCACACCTGCCAAGAGCCATATCCGGGAGGACC 

1501 — ♦ * ♦ ♦ ♦ ♦ 1560 

CACCTGTGTnCCCCGTCCACGACCCCAGTCTCGACGGnCTCGGTATAGGCCCTCCTGC 

D H DAM 

D A D L N 

E E E U L 

1 3 111 

CTCCCCCTGACCTAAGCCCACCCCAAAGGCCAAACTCTCCACTCCCTCAGCTCCCACACC 
1561 ♦ ♦ ♦ ♦ ♦ ♦ 1620 

GACGGGGACTCGAnCGGGTCCGGTnCCGGTnGACAGGTCACGGACTCCAGCCTCTGC 



H B

I M MM P BS M

N N AB S AP A

F L EO T NX E

1 1 32 1 22 3

/ /





TTCTCTCCTCCCAGATTCCAGTAACTCCCAATCTTCTCTCTCCAGAGCCCAAATCTTGTG 

1621 ♦ *' ♦ 168C 

AACAGAGCAGGGTCTAAGGTCATTGAGGGTTAGAACAGAGACCTCTCGGGTTrACAACAC 

E P K S C D - 

N BBS BS 

NS SSC SC HS MA 

LP PTR TR AT N L 

AH IN- NF EU L U 

3: 211 11 31 11 

/ / / / 

ACAAAACTCACACATGCCCACCGTGCCCAGGTAAGCCAGCCCAGGCCTCGCCCTCCAGCT

16E1 - ♦ — ♦ — ■» ♦ 1740 

TGTTnCAGlGTGTACCGCTGGCACGGGTCCAnCCGTCGGGTCCGGAGCGGGACGTCGA 



KTHTCPPCP 







B BS S S S

M B N SM F SC F DHNA HNC

N A L PA O TR A RALU PCR

L N A IE K NF N AEA9 AIF

1 1 4 21 1 11 1 2346 211

/ / /



CAAGGCGGGACAGGTGCCCTAGAGTAGCCTGCATCCAGGGACAGGCCCCAGCCGGGTCCT 

1741 ♦ ♦ ♦ ♦ ♦ 1800 

GnCCGCCCTGTCCACGGGATCTCATCGGACGTAGGTCCCTGTCCGCGGTCGGCCCACGA 
BS S

A M M M D M SC M ANA M 

FAB N D N TR 8 VLU B 

LEO LE L NF0AA9 0 

3 2 2 1 1 1 11 2 246 2 

CACACGTCCACCTCCATCTCTTCCTCACCACCTGAACTCCTCCGGCGACCCTCAGTCnC 
1801 ♦ ♦ ♦ ^ 1850 

CTGTGCAGGTGGAGGTAGAGAAGCAGTCGTGGACTTGAGGACCCCCCTCGCACTCAGAAG 

APELLCCPSVF - 

S SS N 

MS AN M HMANNAC DM M NS 

NT UL N PNVCLUR OS A LP 

L Y 3A L ALAIA9F ET E AH 

11 A3 1 2121461 12 3 31 

CTCnCCCCCCAAAACCCAAGGACACCCTCATGATCTCCCGGACCCCTGAGGTCACATGC 
1861 ♦ •» 1920 

GAGAAGGGGGGTrTTGGCTTCCTGTGGGAGTACTACAGGGCCTGGGCACTCCACTGTACC 
LFPPKPKDTLMISRTPEVTC - 

M M OM M RM M 

A N OS B SA N 

E L ET 0 AE L 

2 1 12 2 12 1 

GTGGTGGTGGACGTGACCCACGAAGACCCTGAGGTCAACrrCAACTGGTACGTGCACGGC 

1921 - ♦ ♦ — - 1980 

CACCACCACCTGCACTCGCTGCnCTGCGACTCCACnCAAGTTGACCATGCACCTGCCG 

VVVDVSHEOPEVKFNWYVOC - 

F FN S 

V N NS: R MR HNC 

N U UPA S AS PCR 

L 4 DBC A E A AIF 

1 H 222 1 2 1 211 

GTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAACAGCACGTACCGG 
1981 ♦ ♦ ♦ ♦ 2040 

CACCTCCACGTAnACGGTTCTGTTTCGGCGCCCTCCTCGTCATGnGTCGTGCATGGCG 
VEVHNAKTKPREEQYNSTYR - 



t 
BS

HH M SC R 

CP N TR S 

AH L NF A 

" 1 11 1 

^ GTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTCGCTCAATCGCAACGAGTACAAGTGC 
20*1 ♦ ♦ ♦ « ♦ ^ 2100 

CACCAGTCCCAGGAGTGGCAGGACGTGGTCCTCACCGACnACCGnCCTCATGrrCACC 

VVSVLTVLHQOWLNCKEYKC - 

M T 
N A 
L Q 
1 1 

. AAGG7CTCCAACAAAGCCCTCCCAGCCCCCATCGAGAAAACCATCTCCAAAGCCAAAGGT 
TTCCAGAGGTTCTnCGGGAGGCTCGGCGGTAGCTCTmGGTAGAGGTrTCCGTTTCCA 
K V S N K A L - P— A P I E - K t I S T A K 

P S S s 

ADNNPMA A H M N HHN BSAH D 

VRLLUNU UAN L APA GFUA 0 

AAAAML9 9 E L A EAE LI9E E 

2244116 6 3 1 3 321 1163 1 
//// / / 

GGGACCCGTGGGGTCCGAGGGCCACATCGACACACGCCGCCTCCGCCCACCCTCTGCCCT 
2161 4 ♦ 4— — ♦ 2220 

CCCTGGGCACCCCACGCTCCCGGTGTACCTG7CTCCGGCCCAGCCGGGTGCGAGACGGGA 

N F 

MM SR MNA B RF 

NA PS NUV B SO 

LEB A L4AV A K 

13 2 1 1 H 1 I 11 

CAGAGTGACCGC7GTACCAACCTCTG7CCTACAGCCCAGCCCCGAGAACCACAGGTGTAC 
2221 ♦ ♦ 4 4— 4 — 2280 

C7CTCAC7GGCGACA7GG77GGAGACAGGA7G7CCCG7CCGGCC7CnGC7G7CCACA7G 

GQPREPQV Y - 
SS BS B

AHNNCCS A F SC SC S

VPCCRRM L O TR TR P

AAIIFFA U K NF NF M

1211111 1 1 11 11 1

///// / /



ACCCTGCCCCCATCCCCGCATGAGCTGACCAACAACCAGCTCAGCCTCACaCCCTCCTC 

2281 ♦ ♦ ♦ ♦ — — — 4 2340 

TGGCACCGGCGTAGGGCCaAnCCACTCGTTCTTCGTCCAGTCCGACTGGACCGACCAG 

TLPPSROELTKNQVSLTCLV - 

F 

N H B 
UP B 
4 A V 

H 2 1 

AAACGCTTCTATCCCACCGACA7CGCCGTGGAGTGGGAGAGCAATCCGCACCCGGAGAAC 

2341 r—trr ♦ ♦ •» ♦ 2400 

TTTCCGAAGATAGGGTCCCTCTAGCCGCACCTCACCCTCTCOTTACCCGTCCGCCTCTTG 

KGFYPSDIAVEWESNGQPEN - 

H 

M I M N H MA 

N N B L P N L 

L F 0 A H L U 

112 4 1 11 

AACTACAAGACCACGCCTCCCGTCCTGGACTCCGACGGCTCCT7CTTCCTCTACAGCAAG 

2401 ♦ 4 4 4 — . 4 4 2460 

nGATGnCTGGTGCGGAGGGCACGACCTGAGGCTGCCGACGAAGAAGGAGATGTCGnC 

NYKTTPPVLOSDCSFFLYSK - 

B F S 

S NM MBX NF M N 

P U6 ABM LA N S 

M 40 EVN AN L 'I 

1 . H2 211 31 1 1 

/ 

C7CACCGTGGACAAGAGCAGGTGGCAGCAGGGGAACGTCT7CTCATGCTCCGTGATGCAT 

2461 4 4 4 4 4 4 2520 

GACTCCCACCTGTTCTCGTCCACCCTCGTCCCCnGCAGAAGAGTACGAGGCACTACGTA 
LTV0KSRWQQGNVF5CSVMH • 
S



N 
L 
A 

3 



M M HNC 

B N PCR 

0 L AlF 

2 1 211 



/ 



GAGGCTCTGCACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTG

2521 ♦ 2580

CTCCGAGACGTGTTGGTGATGTGCGTCTTCTCGGAGAGGGACAGAGGCCCATTTACTCAC

EALHNHYTQKSLSLSPGK*

CXH
FMA
RAE
133
/

CGACGGCCG

2581 2589

GCTGCCGGC
FN SB

N S B M H DHA S 

UP B N C RAU T 

4 B V L A AE9 X 

H 2 1 1 1 236 1 

GCCTGTrTCAGAAGCAGCGGGCAAGAAAGACGCAAGCCCAGACGCCCTGCCAnrCTGTG 

1 ♦ ♦ ♦ 4 60 

CGGACAAACTCTTCGTCGCCCGTrCTnCTGCCTTCCGGTCTCCCCGACGGTAAAGACAC 

B PS S S 

DBS ADNPA D DHNA M HM HNC 

OAP VRLIW D RALU N AN PGR 

ENl AAAM9 E AEA9 L EL AlF 

122 22416 I 2346 1 31 211 

/ /// / / _ / 
CGCTCAGCTCCCTACTGGCTCAGGCCCCTGCCTCCCTCGGCAAGGCCACAATGAACCCGG 

61 ..4 ♦ ♦ 4 .4— 120 

CCGAGTCCAGGGATGACCGAGTCCGGCGACGGAGGGAGCCGncCGGTCnACnCGCCC 

M N R G • 

H F F 

I B N HH N M D 

N B U HA UNO 

F V 4 AE 4 L E 

1 1 H 12 H 1 1 

CAGTCCCTrTTACGCACnGCTTCTGGTGCTGCAACTCGCCCTCCTCCCAGCAGCCACTC 

121 ♦ * ♦ ♦ ♦ 180 

CTCAGGCAAAATCCGTGAACCAAGACCACGACGnCACCGCGAGGAGCGTCGTCGGTGAG 

VPFRHLLLVLQLALLPAAT Q- 

B E E R A 

6 C C S L 

V 0 0 A U 

IKK 11 

AGGGAAAGAAAC7GGTGC7GGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT 

181 4 4 4 4 —4 4. 240 

TCCCTTTCrrrCACCACGACCCGTTTTTTCCCCTATGTCACCnGACTGCACATCTCGAA 
GKKVVLGKKGDTVELTCTAS- 
S F H

A A N H I

L U U H N

U 3 D A F

1 A 2 1 1

H

M M I
B B N
O O F

2 2 1
CCCAGAACAAGACCATACAATTCCACTGCAAA/ACTCCAACCAGATAAAGATTCTCCGAA 

241 ♦ « — - * ♦ — « 300 

GGGTCTrCTTCTCCTATGnAAGGTCACCTTTTrGAGGTTCGTCTATTTCTAAGACCCTr 

QKKSIQFHWKNSNQIKI LGN- 

B S 

NBS F AA 

LAP 0 VU 

ANl K A9 

422 1 26 

ATCAGGGCTCCTTCmACTAAACGTCCATCCAAGCTCAATGATCGCGCTCACTCAAGAA 

301 ♦ 4 ♦ .-4———— 360 

TAGTCCCGACCAAGAAnGATTTCCAGGTAGCJTCGACTTACTAGCCCGACIGAGmn 

QGSFLTKGPSKLNDRAOSRR- 

S S H H 

MANAS BA I A ID 

BVLUT CU N F NO 

0AA9Y L3 F L F E 

22461 lA 12 11 

GAAGCCT7TGGGACCAAGCAAACTTCCCCCTGATCATCAAGAATCTTAACATAGAAGACT 

361 4 4 -4 , 4 4 ♦ 420 

CTTCGGAAACCCTGGnCCTnCAAGGGCGACTAGTAGTTCTrAGAArrCTATCTTCTGA 
SLWDQGNFPLIIKNLKIEDS- 

S 

M M AMAM M 

6 N VNUN A 

0 L AL9L E 

2 1 2161 1 

CAGATACTTACATCTGTGAAGTGGACGACCAGAAOGACGAGGTCCAAnCCTAGTCTrCC 

421 4— 4 4 4 4 4 480 

CTCTATGAATGTAGACACTTCACCTCCTGGTCT7CCTCC7CCACGTTAACCATCACAAGC 
OTYICEVEOQKEEVQLLVFC- 
B

S S 

P T 

M Y 

1 I 

GATTGACTGCCAACTCTGACACCCACCTGCnCAGGGGCAGACCCTCACCCTCACCTTCG 

481 4 * ♦ ♦ 540 

CTAACTGACGGTTGAGACTGTGGGTGGACGAAGTCCCCGTCTCGGACTGGGACTGGAACC 

LTANSDTHLLQGQSLTLTLE- 

B BS H 

B5 SC D V I S 

AP TR 0 N N T 

Nl NF E L F Y 

22 n 1 111 

AGAGCCCCCCTGGTAGTAGCCCCTCAGTGCAA7GTAGGAGTCCAAGGGGTAAAAACATAC 

541 ♦ 4 ♦ ♦ ♦ ♦ 600 

TCTCGGGGGGACCATCATOGGGAGTCACGnACATCCTCAGGTTCCCCATTnrTGTATG 

SPPGSSPSVQCRSPRGKMIQ- 

N BBH S B 65 

M MD ASP A BSSGSC S B N SC 

B ND LPV L AP7IAR T A L TR 

0 LE UBU U NINACF X N A NF 

2 11 122 1 221111 1 1 4 11 

AGGGGGGCAAGACCCTCTCCGTGTCTCAGCTGGAGCTCCAGGATAGTGGCACCTGGACAT 

601 --- * ♦ ♦ ♦ 660 

7CCCCCCCT7C7GGGAGAGGCACAGAGTCGACCTCGAGGTCCTATCACCGTGGACCTGTA 

GGKTLSVS QLELQDSGTWTC- 

NS W NM A 

LP B HA L 

AH 0 EE U 

31 2 11 1 

GCACTGTCTTGCAGAACCAGAAGAAGGTGGAGnCAAAATAGACATCGTGGTGCTAOCTT 

$61 ♦ ♦ ♦ — — ♦ 720 

CGTGACAGAACGTCTTGGTCTTCnCCACCTCAAGTTTTATCTGTAGCACCACGATCGAA 

TVLaNQKKVEFKIOIVVLAF- 
HS MM
AT N N 

EU L L 

31 1 1 

/ 

TCCAGAACGCCTCCAGCATAGTCTATAAGAAAGAGGGGGAACACGTGGACnCTCCrrCC 
721 ♦ 4 — ♦ — 4 4 780 

AGGTCnCCGGAGGTCGTATCAGATAnCTTTCTCCCCCnGTCCACCTCAAGAGGAAGG 
QKASSIVYKKEGEQVEFSFP- 

A AM 

L L N 

U U L 

1 1 1 

CACTCGCCTnACAGTTGAAAAGCTCACGGGCAGTGGCGACCTCTGGTGGCAGGCGCAGA 

781 4 4 4 4 840 

CTGAGCGCAAATGTCAACTTTrCGACTCCCCCTCACCGCTCGACACCACCGTCCGCCTCT 

L AFT V E K_ L T C S GEL % W Q A E R - 

P S 

H M FV A V 
P N LN U B 
H L ML 3 0 
1 1 11 A 2 

GGGCnCCTCCTCCAAGTCTTGGATCACCTTTGACCTGAAGAACAAGGAAGTGTCTCTAA 

841 4 4 -4 — ♦ ♦ —4 900 

CCCCAAGGAGGAGCTTCAGAACCTAGTGGAAACTGGACTTCnGnccnCACAGACAn 
ASS SKSWI T FD LKNK EVSVK- 
B BS PS 

SM SCADNPAD A AH 
TA TRVRLUUD L LP 
EE NFAAAV9E U U H 

23 11224161 1 1 1 

/ / / // 

AACGGCnACCCACGACCCTAAGCTCCACATGGGCAAGAAGCTCCCGCTCCACCTCACCC 

901 * 4 4 960 

TrcCCCAATGCGTCCTGGGATTCGAGGTCTACCCCnCTTCGAGGGCGAGGTGCAGTGGG 

RVTQO PKLQMCKKLPLHLTL- 

BS BSS 

M SC HS D M H SCAHM 

N TR AT 0 N P TRUAN 

L NF EU E L H NF9EL 

1 11 31 1 11 11631 

TGCCCCAGGCCTTCCCTCAGTATGCTGGC7CTGGAAACCTCACCCTGGCCCTTGAAGCGA 
961 4 ♦ * ^ ^ ^ JQ2Q 

ACGGGGKCGGAACGGAGTCATACGACCGAGACCTTTGGAGTGGGACCGGGAACnCGCT 
PHALPQYAGSCNLTLALEAK- 
S BS

F SC H D A 

A TR P D L 

N NF H E U 

in 111 

/ 

, *A*C^GCAAAGncCATCACCAACTCAACCTGCTGGTCATCACAGCCACTCACCTCCAGA 

1021 4 4 4 ♦ 1080 

TnCTCCTTTCAACGTAGTCCnCACnGGACCACCACTACTCTCGCTGAGTCGAGGTCT 
TGKLHQEVNLVVMRATQLITK- 

PS S 

M ADNNPA DF AV DE A 

N VRlLUU da LN OS L 

L AAAAV5 EN UL EP U 

1 - 224416 11 n 11 1 ~ 

AAAATTTGACCTGTGAGGTGTGGGGACCCACCTCCCCTAAGC7GATGCTGAGCTTCAAAC 
1081 ~2--~----4 ------ — 1140 

TTTTAAACTCGACACTCCACACCCCTGGGTGGACGGGAnCGACTACGACTCGAACTTTG 

NLTCEVWGPTSPKLMLSLKL- 

S T H M DM 

f A P N OS 

^ Q A L ET 

1 1 2 1 12 

TGGAGAACAACGAGGCAAAGGTCTCGAAGCGCCAGAAGCCGGTGTGGGTGCTGAACCCTC 
1141 4 — » — — ♦ ^ ^ 

ACCTCnGTTCCTCCGTTTCCAGAGCnCGCCCTCTrCGGCCACACCCACGACnGGGAC 

ENK EAKVSKREKPVWVLNPE- 

H PS H 

F D M I A ADPA I 

0 D A N V VRUU N 
K E E F A AAM9 F 

1 13 11 2216 1 

AGGCGGGGATGTGGCAG7GTCTGCTGAGTGACTCCGGACAGGTCCTGCTGGAATCCAACA 
1201 4 4 ♦ 4 ^ . ^ 1260 

TCCGCCCCTACACCGTCACAGACGACTCACTGAGCCCTGTCCAGGACGACCTTAGGnGT 
ACIIWQCLLSDSCQVLLESNI- 
S SA BHF BS H

ANA HNCP SGNMAANXA RSD I A

VLU PCRA PIUNMULHV SCD N L

AA9 AIFL 1ADLH3A0A AAE D U

236 2111 21211A421 111 3 1

// // / / / / /

TCAAGGTTCTGCCCACATGGTCCACCCCGGTGCACGCGGATCCCGAGGGTGAGTACTAAG

1261 ♦ ♦ ♦ ♦ ♦ 1320

AGTTCCAAGACGGGTGTACCAGGTGGGGCCACGTGCGCCTAGGGCTCCCACTCATGATTC

KVLPTWSTPVHADPE






E BS SS F BS F

H CHH F SC HHNCF N BSC N

P OHA O TR PGCRA U BTR U

H 4AE K NF AAIFN 4 VNF 4

1 712 1 11 21111 H 111 H

/ / // //





CTTCAGCGCTCCTGCCTGGACGCATCCCGGCTATGCAGCCCCAGTCCAGGGCAGCAAGGC

1321 ♦ ♦ 1380

GAAGTCGCGAGGACGGACCTGCGTAGGGCCGATACGTCGGGGTCAGGTCCCGTCGTTCCG

S S

DBHMHNA HMNCN M MNDM

RBABPLU PNCRL N NLDB

AVEOHA9 ALIFA L LAEO

2132146 21114 1 1312

// // //
AGGCCCCGTCTGCCTCTTCACCCGGAGCCTCTGCCCGCCCCACTCATGCTCAGGGAGAGG
1381 ♦ ♦ ♦ 1440

TCCGGGGCAGACGGAGAAGTGGGCCTCGGAGACGGGCGGGGTGAGTACGAGTCCCTCTCC

BS P B BS S 

SC F V B N S SCDHA 

TR L A A L P TRRAU 

NF M E N A 1 NFAE9 

11 1 114 2 11236 

/ / / 

GTCnCTGGCTTTTTCCCACGCTCTGGGCAGGCACAGCCTAGGTCCCCCTAACCCAGGCC 

1441 4 4 4 4 4 4 1500 

CAGAAGACCGAAAAAGGGTCCGAGACCCCTCCGTGTCCGATCCACGGGGATTGGGTCCGG 

B B B S PS 

S DBS S M HNC ADNPA 

P DAP . P N PCR VRLUU 

M ENl ML AIF AAAM9 

1 122 11 211 22416 

/ I I II 
CTGCACACAAAGGGGCAGGTGCTGGGCTCAGACCTGCCAAGAGCCATATCCGGGAGGACC 
1501 4 4 ♦-— 4 ♦ 1560 

GACGTGTGTTTCCCCGTCCACGACCCGAGTCTCGACGGnCTCGGTATACGCCCTCCTGG 
D H DAM

D A D L N

E E E U L 

1 3 111 

CTGCCCCTGACCTAACCCCACCCCAAACGCCAAACTCTCCACTCCCTCAGCTCGGACACC 
1561 ♦ -•» ■• ♦ * J520 

GACGGGGACTCGAnCGGGTGGCGTTTCCGGTTTGAGAGGTCAGGGAGTCGAGCCTCTCG 

LPLT.AHPKGQTLHSLSSDT - 
^ ^ P ^ ^ ^ ^ S J P $ A n 1 P . 

APOLSPPQRPNSPLPQLGHL- 

H S 

I M MM DF F 

N N AB DO A 

F L EO EK N 

1 1 32 11 1 

TTCTCTCCTCCCAGATTCCAGTAACTC-CCAATC7TCTC7CTCAGGGAGTGCATCCGCCCC 
1621 ♦ -» * ^ 2580 

AAGAGAGGAGGGTCTAAGGTCAnGAGGGTTAGAAGAGAGAGTCCCTCACGTAGGCGCGG 

G S A S A P - 

y c 

N 0 
L R 

1 1 

AACCCrmcCCCCTCGTCTCCTGTGAGAAncC . . . . ' 
1681 * * 1714 

TTGGGAAAAGGGGGAGCAGAGGACACTCnAAGG 

TLFPLVSCENS 
Table 4

FN S B

N S B M H DHA S

UP B N G RAU T

4 B V L A AE9 X

H 2 1 1 1 236 1

GCCTGTTTGAGAAGCAGCGGGCAAGAAAGACGCAAGCCCAGAGGCCCTGCCATTTCTGTG
1 ♦ ♦ ♦ ♦ ♦ 60

CGGACAAACTCTTCGTCGCCCGTTCTTTCTGCGTTCGGGTCTCCGGGACGGTAAAGACAC

B PS S S

DBS ADNPA D DHNA M HM HNC

DAP VRLUU D RALU N AN PCR

EN1 AAAM9 E AEA9 L EL AIF

122 22416 1 2346 1 31 211

/ / // /

CGCTCAGGTCCCTACTGGCTCAGGCCCCTGCCTCCCTCGCCAAGGCCACAATGAACCGGG 
61 ♦ 4 ^ ^ J20 

CCCAGTCCAGGGATGACCGAGTCCCGGCACCGAGCGAGCCGnCCGGTGnACnGCCCC 

M N R G - 

H F F 

I B N HH N M D 

N 6 U HA UN 0 

P V 4 . AE 4 L E 

1 1 H 12 H 1 1 

CAGTCCCTTTTAGGCACnGCnCTGGTGCTGCAACTGGCGCTCCTCCCAGCACCCACTC 

121 * «» ♦ ♦ ♦ 180 

CTCAGGCAAAATCCGTGAACGAAGACCACGACGnGACCGCGAGGAGGGTCGTCCGTCAG 

VPFRHLLLVLQLALLPAATQ- 

B E E R A 

B C C S L 

V 0 0 A U 

IKK ^ ^ 

AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT 
181 ♦ --- ^ 

TCCCTnCTTTCACCACGACCCGTTTTnCCCCTATGTCACCnGACTGGACATCTCCAA 
GKKVVLCKKGDTVELTCTAS- 
H

M M I
B B N

O O F
2 2 1

CCCAGAACAAGACCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGAnCTGGGAA 
241 * 300 

CGGTCTTCnCTCGTATGnAACGTGACCTTTTTGACGTTCCTCTATTTCTAACACCCTT 

QKKSIQFHWKNSNQIKILGN- 



B S S F H

NBS F AA A A N H I

LAP O VU L U U H N

AN1 K A9 U 3 D A F

422 1 26 1 A 2 1 1

/ /











ATCAGGGCTCCnCTTAACTAAAGCTCCATCCAAGCTGAATGATCGCCCTGACTCAAGAA 

301 — 4.-.---* 4 4 360 

TAGTCCCGAGGAAGAAnGATTTCCAGGTAGGTTCGACnACTAGCGCGACTGAGTTCTT 
QGSFLTKCPSKLNORAOSRR- 

S S H H 

MANAS BA I A ID 

BVLUT CU N F NO 

0AA9Y L3 F L F E 

22461 lA 12 11 

/ / 
GAAGCCTTTGGGACCAAGGAAACT7CCCCCTGATCATCAAGAATCTTAAGATACAAGACT 

361 4 4 4 4 4 4 420 

CTTCGGAAACCCTGGnCCTTTCAAGGGGGACTAGTAGnCTTAGAATTCTATCnCTGA 

SLWOQGNFPLI IKNLKIEDS- 

S 

M M AMAM M 

B N VNUN A 

0 L AL9L E 

2 1 2161 1 

CAGATACnACATCTGTGAAGTGGAGGACCAGAAGGAGGAGGTGCAAncCTAGTGnCG 

421 — —4 4 4 4 4 — 4 480 

GTCTATGAATGTAGACACTTCAtCTCCTGGTCnCCTCCTCCACGTTAACGATCACAAGC 
OTYICEVEDQKEEVQLLVFC- 
B

S S 

P T 

M Y 

1 1 
GAnOACTCCCAACTCTCACACCCACCTGCTTCAGCGCCAGACCCTGACCCTGACCTTCC 

<81 4 540 

CTAACTGACGGTTGAGAC7GTCGCTCCACGAAGTCCCCGTCTCCGACTCGGACTCCAACC 
LTANSDTHLLQGQ5LTLTLE- 

B B5 H 
BS SC D W I 5 

AP TR D N N T 

Nl NF E L F Y 

22 11 1 111 

/ / 

AGAGCCCCCCTGGTACTAGCCCCTCACTGCAATCTAGGAGTCCAACCGCTAAAAACATAC 

541 « ♦ ♦ • ♦ 600 

TCTCGCGGGGACCATCATCGGCCAGTCACG7TACATCCTCAGGTTCCC-CATT77TCTATG 
SPPGSSPSVQCRSPRGKNiq- 

N BBH S B 85 

M MD ASP A 6SSGSC S B N SC 

B ND LPV L APTIAR T A L TR 

0 LE UBU U NINACF X N A NF 

2 11 122 1 221111 1 1 4 11 

// / /// / 
ACCGGCGGAAGACCCTCTCCGTGTCTCAGCTGCAGCTCCAGGATAG7GCCACCTGCACAT 

501 4- * 4 4 4 ■--* 660 

TCCCCCCCnCTGGGAGAGGCACAGAGTCGACCTeGAOGTCCTATCACCCTGGACCTGTA 
GGKTLSVSQLELQOSGTWTC- 

N 

NS V NV A 

LP B HA L 

AH 0 EE U 

31 2 . 11 1 

GCACTGTCnGCAGAACCAGAAGAAGGTGGAGTTCAAAATACACATCGTGGTCCTAGCTT 

661 4 - 720 

CGTGACAGAACGTCTTGGTCncnCCACCTCAAGTTTTATCTGTACCACCACGATCGAA 
TVLQN5KKVEFKIDIVVLAF- 

HS MM 
AT N N 

EU L L 

31 1 1 

TCCAGAACCCCTCCAGCATAGTCTATAAGAAAGACCGGGAACAGCTGCAGTTCTCCTTCC 

721 ^80 

AGGTCnCCGGAGGTCGTATCAGATATTClT7CTCCCCCrrGTCCACCTCAA<5AGCAAGG 

QKASSIVYKKEGEQVEFSFP- 
A AM

L L N 

U U L 

1 1 1 

CACTCOCCTTrACACnGAAAAGCTCACGCCCAGTCGCGAGCTCTCGTCCCAGCCGGAGA 

781 4 * ♦ 4 ♦ e<o 

CTGAGCGGAAATGTCAACTTTTCGACTCCCCCTCACCGCTCGACACCACCGTCCGCCTCT 
LAFTVEKLTGSGELMKQAER- 

P S 

H M FM A V 
P N LN U B 
H L ML 3 0 
1 1 n A 2 
CGCCnCCTCCTCCAACTCnGGATCACCmGACCTCAAGAACAAGGAAGTGTCTGTAA 

841 ♦ ♦ ♦ ♦ 4 ♦ 900 

CCCGAAGGAGGAGGTTCACAACCTAGTCGAAACTGGACnCTTCnCCTTCACAGACAn 
ASSSKSWITFDLKNKEVSVK- 

B BS PS 

SM SCADNPAD A AH 
TA TRVRLUUO L LP 
EE NFAAAM9E U U H 

23 11224161 1 1 1 

/ / / // 

AACGGGnACCCAGGACCCTAAGCTCCAGATCGGCAAGAAGCTCCCGCTCCACCTCACCC 

901 ♦ 4 4 4 960 

TrGCCCAATGGGTCCTCGGArrCGAGGTCTACCCCnCTTCGAGGGCGAGGTGGAGTGGG 

RVTQDPKLQMGKKLPLHLTL-

BS BSS

M SC HS D M H SCAHM

N TR AT D N P TRUAN

L NF EU E L H NF9EL

1 11 31 1 1 1 11631

/ / / /



TGCCCCACGCCnGCCTCAGTATGCTGGCTCTGGAAACCTCACCCTGGCCCTTCAAGCGA 

961 ♦ ♦ 4 4 — - 1020 

ACGGGGTCCGGAACGGAGTCATACGACCGAGACCTTTGGAGTGGCACCCGGAACTTCCCT 
PQALPQYAGSGNLTLALEAK- 

S BS 

F SC H D A 

A TR P D L 

N NF H E U 

1 11 111 

/ 

AAACACGAAAGTTGCATCAGGAAGTGAACCTGGTGGTGATGACAGCCACTCAGCTCCAGA 
1021 4 ♦ 4 4 ♦ 1080 

TTTGTCCrnCAACGTAGTCCTTCACTTGGACCACCACTACTCTCCGTGAGTCGAGGTCT 
TGKLHQEVNLVVMRATQLQK- 
PS S

M ADNNPA DF AM OE A 

N VRLLUU DA LN DS L 

L AAAAM9 EN UL EP U 

1 224416 11 11 11 1 

AAAATTTGACCTCTCAGGTGTGCGCACCCACCTCCCCTAACCTGATGCTGAGCnGAAAC 
1081 ♦ ♦ * jj4^ 

TnTAAACTCGACACTCCACACCCCTGGGTCGACGGGAnCGACTACGACTCGAACrrrC 
NLTCEVWCPTSPKLMLSLKL- 

W T H M DM 

N A P N OS 

L Q A L ET 

1 1 2 1 12 

TCCAGAACAAGGACCCAAACGTCTCGAAGCGCGACAAGCCCC7CTGGGTGCTGAACCCTG 
1141 .._4.-...- — 4 _ — ^ J 200 

ACCTCnGTTCC-TCCGTTTCCACAGCnCCCCCTCnCGGCCACACCCACGACnGGCAC 
ENKEAKVSKREKPViVLNPE- 

H PS H 

F D M I A ADPA I 

0 D A N V VRUU N 

K E E F A AAV9 F 

113 11 2216 1 

AGGCGGGGATGTGGCAGTCTCTGCTGAGTGAC7CCGCACAGCTCCTGCTGGAATCCAACA 

1201 * * ♦ ♦ ♦ 1260 

TCCGCCCCTACACCGTCACAGACGACTCACTGAGCCCTGTCCACGACCACCTTAGGTTGT 
AGMWQCLLSDSGQVLLESNI- 

S SA BHF BS H 

ANA HNCP SGNVAANXA RSO I A 

VLU PCRA PIUNMULHV SCD N L 

AA9 AIFL 1ADLH3A0A AAE D U 

236 2111 21211A421 111 3 1 

// ////// / 
TCAAGGnCTCCCCACATGGTCCACCCCGCTCCACGCGGATCCCGAGGGTCAGTACTAAG 

1261 •» ♦ ♦ * ♦ * 1320 

AGrrCCAAGACCOGTGTACCAGGTGGGGCCACCTCCGCCTACGCCTCCCACTCATCAnC 

OPE 

BS F 
BSC N 
BTR U 
VNF 4 
111 H 

CTTCAGCGCTCCTGCCTGGACGCATCCCGGCTATGCAGCCCCAGTCCAGCCCAGCAAGGC 
1321 ♦ 133: 

GAAGTCGCGAGGACGGACCTGCC7AGGGCCGA7ACGTCGGGGTCAGGTCCCCTCGTTCCG 
S S

OBHMHNA mUCH W MNDV 

RBABPlU PNCRL N NLD5 

AVE0HA9 ALIFA L LAEC 

2132146 21114 1 1312 

AGCCCCCGTCTGCCTCTTCACCCGGAGCCTCTCCCCCCCCCACTCATCCTCACGCAGAGG 

1381 ♦ ♦ ♦ l^^O 

TCCGGGGCAGACGGAGAAGTGGGCCTCGGAGACGGGCGGGGTGAGTACCAGTCCCTCTCC 

BS P B BS S

SC F M B N S SCDHA

TR L A A L P TRRAU

NF M E N A 1 NFAE9

11 1 114 2 11236

/ / /

GTCTTCTGGCTTTTTCCCAGGCTCTGGGCAGGCACAGGCTAGGTGCCCCTAACCCAGGCC

1441 ♦ ♦ ♦ ♦ ♦ 1500

CAGAAGACCGAAAAAGGGTCCGAGACCCGTCCGTGTCCGATCCACGGGGATTGGGTCCGG

B B B S PS

S DBS S M HNC ADNPA

P DAP P N PCR VRLUU

M EN1 M L AIF AAAM9

1 122 1 1 211 22416

/ / / //



CTGCACACAAAGGGGCAGGTGCTGGGCTCAGACCTGCCAAGACCCATATCCGGGAGGACC 

1501 ♦ 4 — * ♦ — * 1560 

GACGTG7GTTTCCCCGTCCACGACCCGAGTCTGGACGGTTCTCGGTATAGGCCCTCCTGG 

D H DAM 

D A D L N 

E E E U L 

1 3 111 

CTGCCCCTGACCTAAGCCCACCCCAAAGGCCAAACTCTCCAC7CCCTCAGCTCGGACACC 

1561 ♦ * ♦ 1620 

GACGGGGACTGGATTCGGGTGGGGTTTCCGGTTTCAGAGGTGAGGGAGTCGAGCCTGTGG 

H F 

I M MM BP DE AN 

N N AB BS DS LU 

F L EO VT EP U4 

1 1 32 11 11 IH 

/ / / 

TTCTCTCCTCCCAGAnCCAGTAACTCCCAATCTTCTCTCTCCAGTGATTGCTGAGCTCC 

1621 ♦ ♦ <♦ * 1680 

AAGAGAGGAGGGTCTAAGGTCATTGAGGCnAGAAGAGAGACGTCACTAACGACTCGACG 

V I A E L P - 
F

V H V U H 

B G N B U 

0 A L 0 D 

2 11 2 2 

CTCCCAAACTGACCCTCTTCCTCCCACCCCGCGACGCCTTCTrCGGCAACCCCCCCAAGT 

1681 ♦ • ZZ'* ^^^^ 

GACGGTnCACTCGCACAAGCAGGCTGCGGCGaCCCGAAGAAGCCGTTGGGCGCGTTCA 

PKVSVFVPPROGFFGNPRKS- 

BS S H B S F 

A SC H HNC I B SMC N 

L TR A PCR N B TNR U 

U NF E AIF F V NLF 4 

1 11 3 211 1 1 HI H 

/ // // 

CCAAGCTCA7CTGCCAGGCCACCGGTTTCAGTCCCCGGCACA7TCAGCTGTCCTCGCTGC 

1741 <• ♦ ♦ * 1803 

GGnCCACTAGACCGTCCGGTGCCCAAAGTCAGCGGCCGTCTAAGTCCACAG.GACCCACC 

KLICQATGFSPRQIQVSWLR- 

F B S BS H 

NH S H H AM AA SCM D H I 

UH P P G HA VU TRN DAN 

DA M HA AE A9 NFL E E F 

21 1 1 1 23 26 111 13 1 

/ / / 

GCGACGGCAACCAGGTGGGGTCTGCCGTCACCACGCACCAGCTCCAGGCTGAGGCCAA4G 

1801 ♦ « ♦ — ♦ ♦ • 1663 

CGCTCCCCT7CGTCCACCCCACACCCCAGTGGTGCCTGGTCCACGTCCCACTCCGGTT7C 

EGKQVGSGVTTOQVQAEAKE- 

SS B B 

AAHNAB5 SM H 

UUALPAP TA P 

99EAAN1 EE H 

6634122 23 1 

AGTCTGGGCCCACGACCTACAAGGTGACCAGCACACTGACCATCAAAGAC. . . . 

1861 ♦ ♦ ♦ * 1910 

TCAGACCCGGGTCCTGGATGnCCACTGCTCGTCTGACTGGTAGTTrCTC. . . . 

SCPTTYKVTSTLTIKE .... 
FN SB

N S B M H DHA S 

UP B N C RAU 7 

< B V L A AE9 X 

H 2 1 1 1 236 1 

CCCTGrrTCAGAAGCAGCCGGCAACAAAGACCCAAGCCCAGAOGCCCTCCCATnCTGTG 

1 ♦ ♦ •» ♦ 4 60 

CGGACAAACTCnCCTCGCCCGnCTTTCTGCGnCGGGTCTCCCGCACGGTAAAGACAC 

B PS S S 

DBS AONPA D OHNA M HM HNC 

DAP VRLUU 0 RALU N AN PCR 

ENl AAAM9 E AEA9 L EL AlF 

122 22416 1 2346 1 31 211 

I I II I - / / 

CGCTCAGGTCCCTACTGGCTCAGGCCCCTGCCTCCCTCGGCAACGCCACAATGAACCGGG 
61 « 4 4 ^ 

CCGAGTCCAGGGATGACCGAGTCCGGCGACGGAGGGAGCCGnCCGGTGnACnGGCCC 

M N R G - 

H F F 

J B N HH N M 0 

N B U HA UNO 

\ V 4 AE 4 L E 

1 1 H 12 HI 1 

GACTCCCTnTAGGCACTTGCnCTCGTGCTCCAACTCGCGCTCCTCCCAGCAGCCACTC 

121 ♦ 4 4 ♦ 4 ^ JQQ 

CTCAGGGAAAATCCCTGAACGAAGACCACGACGnGACCGCGAGGAGGGTCGTCGCTGAC 
VPFRHLLLVLQLALLPAATQ- 

B C C S L 

V 0 0 A U 

IKK 11 

AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT 

181 4 4 ......4-...-.. 4 4 — - --4 240 

TCCCTFrCTnCACCACCACCCCTTTTTTCCCCTATGTCACCnGACTGGACATGTCGAA 

GKKVVLCKKCOTVELTCTAS- 

H 

MM I 

B B N 

0 0 F 

2 2 1 

CCCAGAAGAAGAGCATACAAnCCACTGGAAAAACTCCAACCAGATAAAGAnCTGGCAA 
241 ♦ - ♦ ♦ 4 4 

GGGTCTTCnCTCGTATGTTAAGGTGACCTTnrTGAGGnGCTCTATTTCTAAGACCCTT 
QKKSI QFHWKNSNQIKILGN. 
N H

VU U H

AN1 A9 D A

422 26 2 1 1

/ /











ATCAGGCCTCCTTCnAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTCACTCAAGAA 

301 -4 4 4— 4 ♦ 360 

TAGTCCCGAGGAAGAAnGATTTCCAGGTAGGnCGACnACTAGCGCGACTGACTTCTT 
QGSFLTKGPSKLNDRAOSRR. 
MANAS BA I A ID

BVLUT CU N F ND

OAA9Y L3 F L F E

22461 1A 1 2 1 1

/ /







GAACCCTTTCGGACCAAGGAAAC7TCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

361 — - ♦ ♦ —4 ♦ 4 420 

CTTCGGAAACCCTGGnCCTTrGAAGGGCCACTAGTACTTCnAGAAnCTATCTTCTCA 

SLWDQGNFPLIIKNLKIE.DS- 

S 

M M AWAM M 

B N VNUN A 

0 . L AL9L E 

2 1 2161 1 

// 

CAGATACnACATCTGTGAAGTGGAGGACCAGAACGAGGAGGTCCAAnGCTAGTGnCG 

421 4 4 4 - 4 480 

GTCTATGAATGTAGACACnCACCTCCTCGTCncCTCCTCCACGnAACGATCACAAGC 
DTYICEVEDQKEEVQLLVFG- 

B 

S S 
P T 
M Y 
I I 
GAnGACTCCCAACTCTGACACCCACCTGCTTCAGGGGCAGAGCCTGACCCTGACCnGG 

481 — 4 — 4 4 4 4 540 

CTAACTCACGGnCAGACTGTCGGTGGACGAAGTCCCCGTCTCCGACTGGGACTGGAACC 
LTANSDTHLLQCQSLTLTLE- 
B BS H
BS SC D MIS 

AP TR D N N T 

Nl NF E L F Y 

22 11 1 111 

/ / 

AGAGCCCCCCTGGTAGTAGCCCaCACTCCAATCTACGAGTCCAAGCCGTAAAAACATAC 
543 ♦ ♦ ♦ ♦ 600 

TCTCGGCGGGACCATCATCGGGCAGTCACGTTACATCCTCAGGTTCCCCATTTTTGTATG 
SPPGSSPSVqCRSPRGKNI Q- 

N BBH S B 6S 

M MD ASP A BSSGSC S B N SC 

8 ND LPV L APTIAR T A L TR 

0 LE U6U U NINACF X N A NF 

2 11 122 1 221111 1 1 4 11 

// / /// / 
AGGGGGCGAACACCCTCTCCCTCTCTCAGCTCGAGCTCCACGATAGTGGCACCTGGACAT 

601 ♦ ♦ « 4 4 4 660 

TCCCCCCCTTCTGGGAGAGGCACAGAGTCCACCTCGAGGTCCTATCACCGTGGACCTGTA 
CCKTLSVSQLELQOSGTWTC- 

N 

NS M NM A 

LP B HA L 

AH 0 EE U 

31 2 11 1 

/ 

GCACTGTCTTGCAGAACCAGAAGAAGGTGGAGnCAAAAtAGACATCCTGGTGCTAGCTT 

661 4 4 4 4 4 4 720 

CGTCACAGAACGTCnGGTCTTCTTCCACCTCAAGTTTTATCTGTAGCACCACGATCGAA 
TVLQNQKKVEFKIDIVVLAF- 

HS MM 
AT N N 

EU L L 

31 1 1 

/ 

TCCAGAAGGCCTCCAGCATAGTCTATAAGAAAGAGGGGGAACAGGTGGAGnCTCCnCC 

721 ♦ ♦ ♦ ♦ ♦ 780 

AGGTCnCCGGAGGTCGTATCAGATATTCTnCTCCCCCnCTCCACCTCAAGAGGAAGG 
QKASSIVYKKEGEQVEFSFP- 

A AM 

L L N 

U U L 

1 11 

CACTCGCCrrTACAGTTCAAAAGCTGACGGGCAGTGGCGAGCTGTGGTGGCAGGCGCACA 
781 ♦ ♦ ♦ * 840 

GTGAGCGGAAATGTCAACTTTTCCACTCCCCGTCACCCCTCGACACCACCCTCCGCCTCT 

LAFTVEKLTGSGE,LWWQAER- 
P S

K M FV A M 
P N LN U B 
H L ML 3 0 
1 1 11 A 2 

CGGCnCCTCCTCCAAGTCnCCATCACCTTTGACCTCAAGAACAACGAAGTGTCTGTAA 
841 * ♦ . •» ♦ 4 ♦ 900 

CCCGAACGAGGAGGTTCAGAACCTAGTCCAAACTGGACnCTTCmCTTCACACACATT 

ASSSKSWITFOLKNKEVSVK. 
B BS PS 

SM SCADNPAD A AH 
TA TRVRLUUO L LP 
EE NFAAAM9E U U H 

23 11224161 1 1 1 

¥ / / // 

AACGGGnACCCAGGACCCTAAGCTCCAGATCGGCAAGAAGCTCCCCCTCCACCTCACCC 

901 -» : ♦ ♦ ♦ 960 

TTGCCCAATGC_GTCCTCGGATTCGAGGTCTACCCGnCT7CGAGGGCGACGTCGAGTG5C 

RVTQDPKLQMGKKLPLHLTL-

BS BSS

M SC HS D M H SCAHM

N TR AT D N P TRUAN

L NF EU E L H NF9EL

1 11 31 1 1 1 11631

/ / / /



7GCCCCAGGCCT7GCCTCAGTATGCTGGCTC7GGAAACCTCACCCTGGCCCTTGAAGCGA 

961 - * ♦ ♦ 1020 

ACGGGGTCCGGAACGGAGTCATACGACCGAGACCtTTGGAGTGGCACCGGGAACTTCGCT 

PQALPQYAGSGNLTLALEAK- 

S BS 

F SC H D A 

A TR POL 
N NF H E U 

1 11 111 

AAACAGGAAAGHGCATCAGGAAGTGAACCTGGTGGTGATGAGAGCCACTCAGCTCCAGA 

1021 ♦ ♦ ♦ ♦ 1080 

TTTGTCCTTTCAACGTAGTCCTTCACnGGACCACCACTACTCTCGGTGAGTCGAGCTCT 

TGKLHQEVNL VVMRAT QLQK- 
PS S

M ADNNPA DF AM DE A

N VRLLUU DA LN DS L

L AAAAM9 EN UL EP U

1 224416 11 11 11 1



1081 * ♦ ♦ ■» ♦ 1140 

TTnAAACTGCACACTCCACACCCCTCGGTCGAGGGGAnCGACTACGACTCGAACnTG 

NLTCEVWCPTSPKLMLSLKL- 

V T H M DM 

N A P N DS 

L Q A L ET 

1 1 2 1 12 

/ 

TGGACAACAAGGAGGCAAAGGTCTCGAAGCGCGAGAAGCCGGTGTGGGTCCTGAACCCTG 

1341 * ♦ ♦ *^ ♦ !200 

ACCTCT7GTTCCTCCG7TrCCAGAGCTTCGCCCTCTTCCGCCACACCCACGACTTCGGAC 

ENKEAKVSKREKPVWVLNPE- 

H PS H 

F D M I A ADPA I 

0 D A N V VRUU N 
K E E F A AAM9 F 

1 13 11 2216 1 

/// 

AGGCGCGGATCTGGCAGTGTCTGCTGAGTGACTCGGCACAGGTCCTGCTGGAATCCAACA 

1201 4— ♦ 4-- ♦ 1260 

TCCGCCCCTACACCGTCACAGACGACTCACTGAGCCCTGTCCAGGACGACCnAGCnGT 

AGM. WQCLLSDSC IlVLLESN I- 
S SA BHF BS B 
ANA HNCP SGNMAANXA SH 
VLU PCRA PIUNM'JLHV " PP 
AA9 AIFL 1ADLH3A0A IH 
236 2111 21211A421 21 

// // / / / / 

TCAAGGTTCTGCCCACATGGTCCACCCCGGTGCACGCGGATCCCGAGGGTGAGTGTGCCC 

1261 ♦ ■» ♦ ♦ 1320 

ACTTCCAAGACGGGTGTACCAGGTGGGGCCACGTGCGCCTAGGGCTCCCACTCACACGGG 

KVLPTWSTPVHADPE 
BS S S S

MF SC F DHNA HNC A M M 

AO TR A RALU PCR FAB 

EK NF N AEA9 AIF LEO 

11 11 1 2346 211 3 2 2 

/ III 

TAGAGTACCCTGCATCCACCCACACGCCCCAGCCGCGTCCTGACACGTCCACCTCCA7CT 

1321 o "» ♦ — 

ATCTCATCGGACGTAGGTCCCTCTCCGGCCTCGGCCCACGACTCTGCAGGTCCAGCTAGA 

BS S 

M D M SC M ANA M MS 

N D N TR B VLU B N T 

LE LNF0AA9 0 LY 

11 1 11 2 246 2 11 

CTTCCTCAGCACCTGAACTCClGGGGGCACCGTCAGTCncCTCTTCCCCCCAAAACCCA 

1381 • * * ■» ♦ * l^^C 

G*:AGGA&TCGTGGACT7GAGGACCCCCCTCGCAGTCAGAAGGAGAAGGCGGGTTrrGGGT 

A P E L L C e ^ S V F 1. -F F f K P K - 

S 5S N 

AN M HMANNAC DM M NS M 

UL N PNVCLUR DS A LP A 

3A L ALAIA9F ET E AH E 

A3 1 2121461 12 3 31 2 

AGGACACCCTCATGATCTCCCGGACCCCTGAGGTCACATGCGTGGTGGTGGACGTGAGCC 

1441 * 1500 

TCCTGTGCGAGTACTAGAGGGCCTGGGGACTCCAGTGTACGCACCACCACCTGCACTCGG 

DTLMISRTPEV TCVVVDVSH- 

M DM M RM M. 

N DS B SA N 

L ET 0 AE L 

1 12 2 12 1 

ACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACGGCGTGGAGGTGCATAATGCCA 

1501 ♦ ♦ ♦ * 1560 

TGCTTCTGGGACTCCAGTTCAAGTTGACCATGCACCTGCCGCACCTCCACGTAnACGGT 

EDPEVKFMWY VDGVEVHNAK- 

F FN S 

M N NSS R MR HNC HH 

N U UPA S A S PCR GP 

L 4 DBC A E A AIF AH 

1 H 222 1 2 1 211 11 

// / 
ACACAAAGCCGCGGGAGGAGCAGTACAACAGCACGTACCGCGTGGTCAGCGTCCTCACCG 

1561 . . * * 1620 

TCTGTTTCGGCGCCCTCCTCGTCATGTTCTCGTGCATGCCCCACCAGTCGCAGGAGTGGC 

TKPREEQYNSTYRVVSVLTV- 
BS

M SC R 

N TR S 

L NF A 

1 11 1 

TCCTCCACCACGACTGGCTGAATCCCAACGAGTACAAGTCCAAGGTCTCCAACAAAGCCC 

1621 * ♦ — 4--- 4 ♦ 1680 

AGGACGTGGTCCTCACCGACTTACCGnCCTCATGTTCACGnCCAGAGGTTCnTCGGG 

LHQ&WLNCKEYKCKVSNKAL- 

P S S 

V T ADNNPMA A 

N A VRLLUNU U 

L q AAAAML9 9 

1 _l 2244116 6 

' //// / 

TCCCAGCCCCCATCCAGAAAACCATCTCCAAAGCCAAAGGTGGGACCCGTGCGGTGCGAG 

1681 ♦ -» ♦ ♦ ♦ ♦ 1740 

ACGGTCGGCGGTAGCTCTTTTGGTAGAGGTnCGGTTTCCACCCTGGGCACCCCACGCTC 
GGCCACATGGACAGAGGCCGGCTCGGCCCACCCTCTGCCCTGACAGTCACCGCTGTACCA

1741 4 4 ♦ 4 4 ♦ 1800 

CCGGTGTACCTGTCTCCGGCCGAGCCGGGTGGGAGACGGGAC7CTCACTGGCGACATGGT 
SS

R F AHNNCC

V S O VPCCRR

L 4 V A K AAIIFF

1 H 1 1 1 1 121111

////

ACCTCTGTCCTACAGGGCAGCCCCGAGAACCACAGGTGTACACCCTGCCCCCATCCCGGG 

1801 4 4 4 4 4 4 I860 

TGGAGACAGGATGTCCCGTCGGCGCTCTTGGTGTCCACATGTGGGACGGGCGTAGGGCCC 
GQPREPQVYTLPPSRD- 
BS BS B

S A F SC SC S 

M L 0 TR TR P 

A U K NF NF M 

1 1 1 11 11 1 

ATGAGCTCACCAAGAACCAGCTCACCCTCACCTCCCTGGTCAAAGGCrrCTATCCCAGCG 

1861 ♦ 4 ♦ * 1920 

TACTCGACTGGnCTTCGTCCAGTCGGACTGGACGGACCAGrnrCCCAAGATACGCTCGC 

ELTKNQVSLTCLVKCFYPSO- 

F 

N H B 
UP B 
4 A V 

H 2 1 

ACATCGCCGTGGAGTGGGAGAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTC
1921 ♦ 1980

TGTAGCGGCACCTCACCCTCTCGTTACCCGTCGGCCTCTTGTTGATGTTCTGGTGCGGAG
IAVEWESNGQPENNYKTTPP-

B

S M A

P N L

M L U

1 1 1

CCGTGCTGGACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGAGCA

1981 ♦ ♦ 2040

GGCACGACCTGAGGCTGCCGAGGAAGAAGGAGATGTCGTTCGAGTGGCACCTGTTCTCGT

VLDSDGSFFLYSKLTVDKSR-

F S

NM MBX NF M N N

UB ABM LA N S L

4O EVN AN L I A

H2 211 31 1 1 3



GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTCATGCATGAGGCTCTGCACAACCACT 

2041 * ■» ♦ ♦ ♦ 2100 

CCACCGTGGTCCCCTTGCAGAAGAGTACGAGGCACTACGTACTCCGAGACGTGTTGGTGA 

WQQGNVFSCSVMHEALHNHY- 

S 

M M HNC CXH 
B N PCR FMA 
0 L AIF RAE 
2 I 211 133 

/ / 
ACACGCAGAAGAGCCTCTCCCTGTCTCCGGCTAAATCAGTGCGACGGCCG 

2101 ♦ ♦ * ♦ ♦ 2150 

TGIGCGTCnCTCGGAGAGGGACACAGGCCCATTTACTCACGCTGCCGGC 

TQKSLSLSPGK. 
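
Because the listing prints both strands of every 60-base block, the lower strand can serve as a cross-check on the upper one. The following is a minimal sketch only; the helper name `strand_mismatches` and the input strings are hypothetical and are not part of the disclosure. It reports the positions at which two aligned strands are not Watson-Crick complements.

```python
# Hypothetical helper: compare an upper strand with the lower strand printed
# beneath it and return the 1-based positions that are not A-T / G-C pairs.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def strand_mismatches(upper: str, lower: str) -> list[int]:
    upper, lower = upper.upper(), lower.upper()
    if len(upper) != len(lower):
        raise ValueError("strands must be the same length")
    # Any character outside A/C/G/T is reported as a mismatch.
    return [
        i + 1
        for i, (u, l) in enumerate(zip(upper, lower))
        if PAIRS.get(u) != l
    ]


# Illustrative input only (not taken from the listing).
print(strand_mismatches("ACGTAC", "TGCATC"))
```

Positions flagged by such a check are candidates for transcription errors in one of the two strands; the check alone cannot say which strand is wrong.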
Example 2: Preparation of the Fusion Proteins from Supernatants of COS Cells 

COS cells grown in DME medium supplemented with 10% Calf Serum and gentamicin sulfate at 15 
µg/ml were split into DME medium containing 10% NuSerum (Collaborative Research) and gentamicin to 
give 50% confluence the day before transfection. The next day, CsCl-purified plasmid DNA was added to a 
final concentration of 0.1 to 2.0 µg/ml, followed by DEAE Dextran to 400 µg/ml and chloroquine to 100 µM. 
After 4 hours at 37°C, the medium was aspirated and a 10% solution of dimethyl sulfoxide in phosphate 
buffered saline was added for 2 minutes, aspirated, and replaced with DME/10% Calf Serum. 8 to 24 hours 
later, the cells were trypsinized and split 1:2. 
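
The final concentrations given above translate into pipetting volumes once stock concentrations and a dish volume are fixed. The sketch below is illustrative only: the stock concentrations (10 mg/ml DEAE Dextran, 10 mM chloroquine), the 5 ml medium volume and the function name `stock_volume_ul` are assumptions, not values from this example.

```python
# Sketch: volume of stock to add so that a reagent reaches a target final
# concentration in the culture medium (C1*V1 = C2*V2, ignoring the small
# volume change caused by the addition itself).

def stock_volume_ul(final_conc: float, stock_conc: float, medium_ml: float) -> float:
    """Return microlitres of stock; both concentrations must share one unit."""
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc / stock_conc * medium_ml * 1000.0


MEDIUM_ML = 5.0  # assumed dish volume, not stated in the text

# Final concentrations are from the text; stock concentrations are assumed.
deae_ul = stock_volume_ul(400.0, 10_000.0, MEDIUM_ML)         # µg/ml units
chloroquine_ul = stock_volume_ul(100.0, 10_000.0, MEDIUM_ML)  # µM units
print(f"DEAE Dextran: {deae_ul:.0f} µl, chloroquine: {chloroquine_ul:.0f} µl")
```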

For radiolabeling, the medium was aspirated 40 to 48 hours after transfection, the cells were washed once 
with phosphate buffered saline, and DME medium lacking cysteine or methionine was added. 30 minutes 
later, ³⁵S-labeled cysteine and methionine were added to final concentrations of 30-60 µCi and 100-200 µCi 
respectively, and the cells were allowed to incorporate label for 8 to 24 more hours. The supernatants were 
recovered and examined by electrophoresis on 7.5% polyacrylamide gels following denaturation and 
reduction, or on 5% polyacrylamide gels following denaturation without reduction. The CD4Bγ1 protein gave the 
same molecular mass with or without reduction, while the CD4Eγ1 and CD4Hγ1 fusion proteins showed 
molecular masses without reduction of twice the mass observed with reduction, indicating that they formed 
dimer structures. The CD4 IgM fusion proteins formed large multimers beyond the resolution of the gel 
system without reduction, and monomers of the expected molecular mass with reduction. 

Unlabeled proteins were prepared by allowing the cells to grow for 5 to 10 days post transfection in 
DME medium containing 5% NuSerum and gentamicin as above. The supernatants were harvested, 
centrifuged, and purified by batch adsorption to either protein A trisacryl, protein A agarose, goat 
anti-human IgG antibody agarose, rabbit anti-human IgM antibody agarose, or monoclonal anti-CD4 antibody 
agarose. Antibody agarose conjugates were prepared by coupling purified antibodies to cyanogen bromide 
activated agarose according to the manufacturer's recommendations, using an antibody concentration 
of 1 mg/ml. Following batch adsorption by shaking overnight on a rotary table, the beads were harvested by 
pouring into a sintered glass funnel and washed a few times on the funnel with phosphate buffered saline 
containing 1% Nonidet P40 detergent. The beads were removed from the funnel and poured into a small 
disposable plastic column (Quik-Sep QS-Q column, Isolab), washed with at least 20 column volumes of 
phosphate buffered saline containing 1% Nonidet P40 and with 5 volumes of 0.15 M NaCl, 1 mM EDTA (pH 
8.0), and eluted by the addition of either 0.1 M acetic acid, 0.1 M acetic acid containing 0.1 M NaCl, or 0.25 
M glycine-HCl buffer, pH 2.5. 



Example 3: Blockage of Syncytium Formation by the Fusion Proteins 

Purified or partially purified fusion proteins were added to HPB-ALL cells infected 12 hours previously 
with a vaccinia virus recombinant encoding HIV envelope protein. After incubation for 6-8 more hours, the 
cells were washed with phosphate buffered saline, fixed with formaldehyde, and photographed. All of the 
full-length CD4 immunoglobulin fusion proteins showed inhibition of syncytium formation at a concentration 
of 20 µg/ml, with the exception of the CD4Hγ1 protein, which was tested only at 5 µg/ml and showed partial 
inhibition of syncytium formation under the same conditions. 



Example 4: Chromium Release Cytolysis Assay 

The purified fusion proteins were examined for ability to fix complement in a chromium release assay 
using vaccinia virus infected cells as a model system. Namalwa (B cell) or HPB-ALL (T cell) lines were 
infected with vaccinia virus encoding HIV envelope protein, and 18 hours later were radiolabeled by 
incubation in 1 mCi/ml sodium ⁵¹chromate in phosphate buffered saline for 1 hour at 37°C. The labeled cells 
were centrifuged to remove the unincorporated chromate, and incubated in microtiter wells with serial 
dilutions of the CD4 immunoglobulin fusion proteins and rabbit complement at a final concentration of 40%. 
After 1 hour at 37°C, the cells were mixed well and centrifuged, and the supernatants were counted in a gamma-ray 
counter. No specific release could be convincingly documented. 
Example 5: Binding of the CD4Eγ1 Protein to Fc Receptors 



Purified CD4Eγ1 fusion protein was tested for its ability to displace radiolabeled human IgG1 from 
human Fc receptors expressed on COS cells in culture. The IgG1 was radiolabeled with sodium ¹²⁵iodide 
using 1 mCi of iodide, 100 µg of IgG1, and two Iodobeads (Pierce). The labeled protein was separated from 
unincorporated counts by passage over a Sephadex G25 column equilibrated with phosphate buffered 
saline containing 0.5 mM EDTA and 5% nonfat milk. Serial dilutions of the CD4Eγ1 fusion protein or 
unlabeled IgG1 were prepared and mixed with a constant amount of radiolabeled IgG1 tracer. After 
incubation with COS cells bearing the FcRI and FcRII receptors at 4°C for at least 45 minutes in a volume 
of 20 µl, 200 µl of a 3:2 mixture of dibutyl to dioctyl phthalates were added, and the cells were separated from 
the unbound label by centrifugation in a microcentrifuge for 15 to 30 seconds. The tubes were cut with 
scissors, and the cell pellets counted in a gamma-ray counter. The affinity of the CD4Eγ1 protein for 
receptors was measured in parallel with the affinity of the authentic IgG1 protein, and was found to be the 
same, within experimental error. 
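
One common way to compare two competitors in a displacement assay of this kind is to estimate, for each, the concentration at which half of the tracer remains bound. The sketch below assumes that approach; the text does not state how the affinities were actually calculated, and the function name `ic50` and all numbers are hypothetical.

```python
import math


def ic50(concs, bound_fraction):
    """Interpolate, on a log10 concentration axis, the competitor
    concentration at which half of the tracer remains bound.

    concs must be ascending and positive; bound_fraction is tracer bound
    relative to the no-competitor control (1.0 = no displacement).
    """
    points = list(zip(concs, bound_fraction))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            t = (f_lo - 0.5) / (f_lo - f_hi)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("bound fraction never crosses 0.5")


# Hypothetical serial dilutions (arbitrary concentration units).
dilutions = [1.0, 10.0, 100.0]
fusion_curve = [0.9, 0.5, 0.1]      # invented data for the fusion protein
igg1_curve = [0.88, 0.5, 0.12]      # invented data for unlabeled IgG1
print(ic50(dilutions, fusion_curve), ic50(dilutions, igg1_curve))
```

Similar half-displacement concentrations for the two curves would be consistent with similar affinities.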


Example 6: Stable Expression of the Fusion Construct pCD4Eγ1 in Baby Hamster Kidney Cells 

Twenty-four hours before transfection, 0.5 x 10^ baby hamster kidney cells (BHK; ATCC CCL10) were 
seeded in a 25 cm² culture flask in Dulbecco's modified Eagle's medium (DMEM) containing 10% fetal 
calf serum (FCS). The cells were cotransfected with a mixture of the plasmids pCD4Eγ1 (20 µg), pSV2dhfr 
(5 µg; Lee et al., Nature 294:228-232 (1981)) and pRMH140 (5 µg; Hudziak et al., Cell 31:137-146 (1982)) 
according to a modified calcium phosphate transfection technique as described in Zettlmeissl et al. (Behring 
Inst. Res. Comm. 82:26-34 (1988)). 72 h post-transfection, cells were split 1:3 to 1:4 (60 mm culture dishes) 
and resistant colonies were selected in DMEM medium containing 10% FCS, 400 µg/ml G418 (Geneticin, 
Gibco) and 1 µM methotrexate (selection medium). The medium was changed twice a week. The resistant 
colonies (40-100/transfection) appeared 10-15 days post-transfection and were further propagated either as a 
mixture of clones (i.e., BHK-NK1) or as individually isolated clones. For the determination of the relative 
expression levels, clone mixtures or individual clones were grown to confluency in T25 culture flasks, 
washed twice with protein-free DMEM medium, and incubated for 24 h with 5 ml protein-free DMEM 
medium. These media were collected and subjected to a human IgG specific ELISA in order to determine 
the relative expression levels of the CD4-IgG1 fusion protein CD4Eγ1. For further analysis an individual 
clone (BHK-UC3) was chosen due to its high relative expression level. 


Example 7: Detection of the CD4Eγ1 Protein in Culture Supernatants 

For methionine labeling of cells, the clone BHK-UC3 and untransfected BHK cells (control) were 
grown to confluency in T25 culture flasks and subsequently incubated for two hours in HamF12 medium 
without methionine. Labeling was achieved by incubating for 24 h in 2.5 ml of the same medium containing 100 
µCi methionine (1070 Ci/mmole, Amersham). For the preparation of cell lysates, the labeled cells were 
harvested in 1 ml of phosphate buffered saline, pH 7.2 (PBS) and lysed by repetitive freezing and thawing. 
Cleared lysates (after centrifugation at 20000 rpm, 20 min) and culture supernatants were incubated with 
Protein A-Sepharose (Pharmacia) and the bound material was analyzed on a 10% SDS-gel according to Laemmli (Nature 
227:680-685 (1970)), which was subsequently autoradiographed. A specific band of about 80 kDa is 
detected only in the supernatant of clone BHK-UC3; it is absent from the lysate of clone BHK-UC3 and from 
the respective controls. 


Example 8: Purification of the Protein CD4Eγ1 from Culture Supernatants 

In order to demonstrate that the fusion protein coded by the plasmid pCD4Eγ1 can be obtained in high 
quantities, the clone BHK-UC3 was grown in 1750 cm² roller bottles in selection medium (500 ml). Confluent 
monolayers were washed twice with protein-free DMEM medium (200 ml) and further incubated for 48 h 
with protein-free DMEM medium (500 ml). The conditioned culture supernatants (1-2 l) and respective 
supernatants from untransfected BHK cells were cleared by centrifugation (9000 rpm, 30 min) and 
microfiltered through a 0.45 µm membrane (Nalgene). After addition of 1% (v/v) of 1.9 M Tris-HCl buffer, 
pH 8.6, the conditioned medium was adsorbed to a Protein A-Sepharose column equilibrated with 50 mM 
Tris-HCl pH 8.6 buffer containing 150 mM NaCl (4°C). The loaded column was washed with 10 column 
volumes of equilibration buffer. Elution of the CD4-IgG1 fusion protein CD4Eγ1 was achieved with 0.1 M 
sodium citrate buffer, pH 3, followed by immediate neutralization of the column efflux to pH 8 by Tris-base. 
The peak fractions were pooled, and the pool was analyzed on a Coomassie blue stained SDS-gel, resulting 
in a band of the expected size (80 kDa) which reacted with a polyclonal anti-human IgG heavy chain 
antibody and a mouse monoclonal anti-CD4 antibody (BMA040, Behringwerke) in Western Blots. The 
yield of purified fusion protein obtained by the given procedure is 5-18 mg/24 h/l culture supernatant. The 
respective value for a BHK clone mixture (about 80 resistant clones; BHK-NK1) as described above was 2-3 
mg/24 h/l. 
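
The yield figures above are expressed per 24 h and per litre of culture supernatant. A minimal sketch of that normalisation follows; the function name `volumetric_yield` and the example numbers are hypothetical and are not measurements from this example.

```python
def volumetric_yield(mg_protein: float, hours: float, litres: float) -> float:
    """Purified protein normalised to mg per 24 h per litre of supernatant."""
    if hours <= 0 or litres <= 0:
        raise ValueError("hours and litres must be positive")
    return mg_protein / (hours / 24.0) / litres


# Invented illustration: 30 mg recovered from 1.5 l conditioned for 48 h.
print(volumetric_yield(30.0, 48.0, 1.5))  # mg/24 h/l
```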



Example 9: Physical and Biological Characterization of the CD4Eγ1 Fusion Protein 

As shown by SDS-electrophoresis on 10-15% gradient gels (Phast-System, Pharmacia) under 
non-reductive conditions, the CD4Eγ1 fusion protein migrates at the position of a homodimer (about 160 kDa), 
like a non-reduced mouse monoclonal antibody. This result is supported by analytical equilibrium 
ultracentrifugation, where the fusion protein behaves as a homogeneous dimeric molecule of about 150 kDa. The 
absorbance coefficient of the protein was determined as A280 = 18 cm²/mg using the quantitative protein 
determination according to Bradford (Anal. Biochem. 72:248-254 (1976)). 

The CD4Eγ1 fusion protein shows specific complex formation with a solubilized βgal-gp120 fusion 
protein (pMB1790; Broker et al., Behring Inst. Res. Commun. ^:338-348 (1988)) expressed in E. coli. In this 
protein (110 kDa), a major part of the HIV gp120 protein (VaU9-Trp64s) is fused to β-galactosidase (amino 
acids 1-375). In a control experiment, a 67-kDa β-gal-HIV 3'orf fusion protein (βgal1-375; 3'orf Pro14-Asp123) 
showed no complex formation. In these experiments, the CD4Eγ1 protein was incubated with the 
respective fusion protein in molar ratios of about 5:1. The complex was isolated by binding to Protein 
A-Sepharose, and the Protein A-Sepharose bound proteins, together with relevant controls, were analyzed on 
10-15% gradient SDS-gels (Phast-System, Pharmacia). 
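
A molar ratio of about 5:1 can be converted into masses once the molecular masses are fixed. The sketch below is illustrative only: the text does not say whether the ratio counts the fusion protein as an 80 kDa chain or as the dimer, so the molecular mass is left as a parameter, and the function name `mass_for_molar_ratio` and the 11 µg input are hypothetical.

```python
def mass_for_molar_ratio(partner_ug, partner_kda, protein_kda, molar_ratio):
    """Return µg of protein giving `molar_ratio` mol protein per mol partner.

    Uses 1 kDa = 1 µg/nmol, so µg divided by kDa gives nmol.
    """
    partner_nmol = partner_ug / partner_kda
    return partner_nmol * molar_ratio * protein_kda


# Invented illustration: 11 µg of a 110 kDa partner, 80 kDa protein, 5:1.
print(mass_for_molar_ratio(11.0, 110.0, 80.0, 5.0))  # µg of protein
```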

The CD4Eγ1 fusion protein binds to the surface of HIV (HIV1/HTLV-IIIB) infected cultured 
T4-lymphocytes, as determined by direct immunofluorescence with fluorescein-isothiocyanate (FITC) labeled 
CD4Eγ1 protein. It blocks syncytia formation in cultured T4-lymphocytes upon HIV infection (0.25 TCID/cell) 
at a concentration of 10 µg/ml. Furthermore, HIV-infected cultured T4-lymphocytes (subclone of cell line 
H9) are selectively killed upon incubation with CD4Eγ1 in the presence or absence of complement. To a 
highly (>50%) HIV infected culture of T4-lymphocytes (10^ cells/ml), 50, 10 or 1 µg/ml CD4Eγ1 fusion 
protein was added in the presence or absence of guinea pig complement. Cells were observed for specific 
killing by the fusion protein, which is defined as the percentage of killed cells after 3 days in relation to 
viable cells in the culture at the beginning of the experiment, corrected by the values for unspecific killing 
observed in control cultures lacking the CD4Eγ1 fusion protein (Table 5, Experiment I). Surprisingly, 
addition of CD4Eγ1 protein to the infected T4 cells in the absence of complement resulted in similar 
specific killing rates as in the presence of complement (Table 5, Experiment II). This result demonstrates a 
complement independent cytolytic effect of CD4Eγ1 on HIV infected T-lymphocytes in culture. 

Table 5 

Experiment   Assay System                                         Specific
No.                                                               Killing (%)

I            non-infected T4-cells + 50 µg/ml CD4Eγ1 + Compl.      0.7
             infected T4-cells + 50 µg/ml CD4Eγ1 + Compl.         35.1
             infected T4-cells + 10 µg/ml CD4Eγ1 + Compl.         25.1
             infected T4-cells + 1 µg/ml CD4Eγ1 + Compl.          25
II           infected T4-cells + 10 µg/ml CD4Eγ1 + Compl.         49.9
             infected T4-cells + 10 µg/ml CD4Eγ1 + Compl.         69.4
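
The specific killing values in Table 5 follow the definition given in the text: killed cells after 3 days relative to the viable cells at the start, corrected for unspecific killing in control cultures. The sketch below assumes that the correction is a simple subtraction, which the text does not state explicitly; the function name `specific_killing` and the cell counts are hypothetical.

```python
def specific_killing(killed_treated, viable_start_treated,
                     killed_control, viable_start_control):
    """Percent specific killing, assuming correction by subtraction.

    Each percentage is killed cells after 3 days relative to the viable
    cells present in that culture at the start of the experiment.
    """
    treated = 100.0 * killed_treated / viable_start_treated
    unspecific = 100.0 * killed_control / viable_start_control
    return treated - unspecific


# Invented counts: 400 of 1000 cells killed with fusion protein,
# 50 of 1000 killed in the control culture without it.
print(specific_killing(400, 1000, 50, 1000))  # percent
```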



Having now fully described this invention, it will be appreciated by those skilled in the art that the same 
can be performed within a wide range of equivalent parameters of composition, conditions, and methods of 
preparing such fusion proteins without departing from the spirit or scope of the invention or any 
embodiment thereof. 
Claims 



1. A fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to 
HIV gp120, and 2) the DNA sequence of an immunoglobulin heavy chain, characterized in that the DNA 
sequence which encodes the variable region of said immunoglobulin chain has been replaced with the DNA 
sequence which encodes CD4, or said gp120 binding fragment thereof. 

2. The fusion protein gene of claim 1, wherein the DNA sequence which encodes said fragment of CD4 
comprises the following DNA sequence: 



CAATCAACCGCG 

-+ 120 

GTTACTTGGCCC 



GAGTCCCTTTTAGCCACTTCCTTCTGGTGCTGCAACTGGCGCTCCTCCCACCAGCCACTC 
CTCAGGGAAAATCCGTGAACGAAGACCACCACCTTGACCCCCACGAGGGTCGTCCGTGAG 
AGGCHAAGAAAGTGGTGCTGGGCAJE^AAAGG^ 

TCCCTTTCTTTCACCACCACCCGTTTTTTCCCCTATGTCACCTTGACTGGACATGTCGAA 



CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAA 

241 + + + . 

* +- + 300 

GGGTCTTCTTCTCGTATGTTAAGGTCACCTTTTTGAGGTTGCTCTATTTCTAAGACCCTT 



ATCAGGGCTCCTTCTTAACTAAACCTCCATCCAAGCTGAATCATCGCGCTGACTCAAGAA 
TAGTCCCGAGGAAGAATTGATTTCCAGGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT 
GAAGCCTTTGGGACCAAGCAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 
CTTCGGAAACCCTGGTTCCXTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTCA 



cacatacttacatctgtgaagtggaggaccacaagcAggaggtccaattgctagtgttcg 

421 + + + ♦ 480 

ctctatgaatgtagacacttcacctcctggtcttcctcctccacgttaacgatcacaagc 
gattgactgccaactctgacacccacctgcttc 

481 * +-.-- 

ctaactgacgcttgagactgtcggtggacgaag 



or a degenerate variant thereof, or the following DNA sequence: 
CAATGAACCGGG 

-+ 120 

GTTACTTGGCCC 



CAGTCCCTTTTACGCACTTGCTTCTGCTCCTGCAACTCGCCCTCCTCCCAGCACCCACTC 

121 + 4-.. ♦ 180 

CTCAGGCAAAATCCGTGAACGAAGACCACGACCTTCACCCCGAGGACGGTCGTCCGTCAO 

AGGGAAAGAAAGTGGTGCTGGGCAAAAAACGCCATACAGTGGAACTGACCTGTACAGCTT 

181 + ♦ + + + 240 

TCCCTTTCTTTCACCACCACCCGTTTTTTCCCCTATGTCACCTTGACTGGACATGTCCAA 

CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAA 

241 + +- + + + 300 

CGCTCTTCTTCTCGTMrGTTAAGGTCACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT 

ATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCCCGCTGACTCAAGAA 

301 360 

TAGTCCCGAGGAAGAATTGATTTCCAGGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT 

GAAGCCTTTGGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

361 +- ♦ + + + 420 

CTTCGGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA 

CAGAXACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGACGTGCAATTGCTAGTGTTCO 

421 + + + + + 480 

GTCTATGAATGTAGACACTTCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC 



GATTGACTGCCAACXCTGACACCCACCTGCTTCAGGGGCAGAGCCTCACCCTGACCTTGG 
481 + + + + 540 

CTAACTGACCGTTGAGACTGTGGGXGGACGAAGTCCCCGTCTCGGACTGGGACTGGAACC 

AGAGCCCCCCTGGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGGGGTAAAAACATAC 
541 + -f 4. 600 

TCTCGGGGGGACCATCATCGGGGAGTCACGTTACATCCXCACGTTCCCCATTTTTGTATG 



AGGGGGGGAAGACCCXCXCCGXCXCXCAG 

601 ♦ ^ 

XCCCCCCCXXCXGGGAGAGGCACAGAGXC 



or a degenerate variant thereof. 

3. The fusion protein gene of claim 1 or 2, characterized in that said immunoglobulin chain is of the 
class IgM, IgG1 or IgG3. 

4. A fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to 
HIV gp120, and 2) the DNA sequence of an immunoglobulin light chain, characterized in that the DNA 
sequence which encodes the variable region of said immunoglobulin light chain has been replaced with the 
DNA sequence which encodes CD4, or HIV gp120-binding fragment thereof. 

5. The fusion protein gene of claim 4, characterized in that the DNA sequence which encodes said 
fragment of CD4 comprises the following DNA sequence: 

CAATGAACCGGG 
120 


GTTACTTGGCCC 

CAGTCCCTTTTAGGCACTTGCTTCTGCTCCTCCAACTGGCGCTCCTCCCAGCAGCCACTC 

121 + + + + i- 180 

CTCAGGGAAAATCCGTGAACCAAGACCACCACCTTCACCGCCAGGAGGGTCGTCGGTGAG 




AGGGAAAGAAACTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTTL, 

IBl + + + + ♦ 240 

TCCCTTTCTTTCACCACGACCCGTTTTTTCCCCTATGTCACCTTGACTCGACATGTCCAA 


CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGPGAA 

241 + + + + 300 

CGGTCTTCTTCTCGTATGTTAAGGTGACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT 


ATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAA 

301 + ♦ — + + ♦ 360 

TAGTCCCGAGGAAGAATTCATTTCCACGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT 


GAAGCCTTTGGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

361 + + + + ♦ 420 

CTTCGGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA 

CACATACTTACATCTGTGAAGTCCAGGACCAGAACCACGAGGTGCAATTGCTAGTGTTCG 

421 + + + + + 480 

GTCTATGAATGTACACACTTCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC 

^ GATTGACTGCCAACTCTGACACCCACCTCCTTC 

481 ♦ + 

CTAACTGACCCTTCAGACTGTGGGTGGACCAAG 

or a degenerate variant thereof, or the following DNA sequence: 
CAATGAACCGGG 

-+ 120 

GTTACTTGGCCC 

GAGTCCCTTTTAGGCACTTGCTTCTGGTGCTGCAACTGCCGCTCCTCCCAGCAGCCACTC 

121 + .•-+ .+ +--... 180 

CTCAGGGAAAATCCGTGAACGAAGACCACGACGTTGACCGCGAGGAGGGTCGTCGGTGAG 

AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT 
181 ♦ ..--+ + 4 240 

TCCCTTTCTTTCACCACGACCCGTTTTTTCCCCTATGTCACCTTGACTGGACATGTCCAA 



CCCAGAAGAACAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAA 

241 + + — .4.-77; + •-•.-+ 300 

GCCTCTTCTTCTCGTATGTTAAGGTGACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT 

ATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCGCCCTGACTCAAGAA 

301 --+ + +- ♦ 360 

TAGTCCCGAGGAAGAATTGATTTCCAGCTAGGTTCGACTTACTAGCGCGACTGAGTTCTT 

GAAGCCTTTGGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

361 + ♦ ♦ ----+ 420 

CTTCGGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA 

CAGATACTTACATCTGTGAAGTGGACGACCAGAAGGAGGAGGTGCAATTGCTAGTGTTCG 

421 + + + + + 480 

GTCTATGAATGTAGACACTTCACCTCCTGCTCTTCCTCCTCCACGTTAACGATCACAAGC 

GATTGACTCCCAACTCTGACACCCACCTGCTTCAGGGGCAGAGCCTCACCCTGACCTTGG 

481 - + + + + 540 

CTAACTGACGGTTGAGACTCTGGGTCGACGAAGTCCCCGTCTCCGACTGGGACTGGAACC 

AGAGCCCCCCTGGTAGTACCCCCTCAGTGCAATGTAGGAGTCCAAGGGCTAAAAACATAC 

541 ♦ +. 600 

TCTCGGGGCGACCATCATCGGGGACTCACGTTACATCCTCAGGTTCCCCATTTTTGTATG 

AGGGGGGGAAGACCCTCTCCGXGTCTCAG 

601 + 

TCCCCCCCTTCTGGGAGAGCCACAGAGTC 

or a degenerate variant thereof. 
6. A vector comprising the fusion protein gene of claim 1, preferably having the identifying 
characteristics of pCD4Hγ1, which has been deposited under Accession No. 67611, or of pCD4Mµ, which has been 
deposited under Accession No. 67608, or of pCD4Pµ, which has been deposited under Accession No. 
67609, or of pCD4Eγ1, which has been deposited under Accession No. 67610, all in E. coli at the ATCC 
under the terms of the Budapest Treaty. 

7. A vector comprising the fusion protein gene of claim 4. 

8. A host transformed with the vector of claim 6 or 7. 

9. The host of claim 8 which expresses an immunoglobulin light chain together with the expression 
product of said fusion protein gene to give an immunoglobulin-like molecule which binds to gp120 or an 
immunoglobulin heavy chain together with the expression product of said fusion protein gene to give an 
immunoglobulin-like molecule which binds to HIV or SIV gp120. 

10. The host of claim 9, wherein said immunoglobulin heavy chain is of the immunoglobulin class IgM, 
IgG1 or IgG3. 

11. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120, 
and an immunoglobulin heavy chain, wherein the variable region of the immunoglobulin chain has been 
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a 
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 6, 
said vector further comprising expression signals which are recognized by said host strain and direct 
expression of said fusion protein, and recovering the fusion protein so produced. 

12. The method of claim 11, wherein said host strain is a myeloma cell line which produces 
immunoglobulin light chains and said fusion protein comprises an immunoglobulin heavy chain of the class 
IgM, IgG1 or IgG3, wherein an immunoglobulin-like molecule comprising said fusion protein is produced. 

13. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120, 
and an immunoglobulin light chain, wherein the variable region of the immunoglobulin chain has been 
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a 
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 7, 
said vector further comprising expression signals which are recognized by said host strain and direct 
expression of said fusion protein, and recovering the fusion protein so produced. 

14. The method of claim 13, wherein said host produces immunoglobulin heavy chains of the class 
IgM, IgG1 and IgG3 together with said fusion protein to give an immunoglobulin-like molecule which binds 
to HIV-gp120. 

15. A fusion protein, which is preferably detectably labeled, comprising CD4, or fragment thereof which 
is capable of binding to HIV or SIV gp120, fused at the C-terminus to a second protein which comprises an 
immunoglobulin heavy chain of the class IgM, IgG1 or IgG3, wherein the variable region of said heavy chain 
immunoglobulin has been replaced with CD4, or HIV gp120-binding fragment thereof, and preferably further 
comprising a therapeutic agent, radiolabel or NMR imaging agent linked to said fusion protein. 

16. The fusion proteins CD4Hγ1, CD4Mµ, CD4Pµ, CD4Eγ1 or CD4Bγ1. 

17. An immunoglobulin-like molecule, comprising the fusion protein of claim 15 and an immunoglobulin 
light chain, preferably further comprising a detectable label, and especially further comprising a therapeutic 
agent, radiolabel or NMR imaging agent linked to said immunoglobulin-like molecule. 

18. A fusion protein comprising CD4, or fragment thereof which binds to HIV gp120, fused at the 
C-terminus to a second protein comprising an immunoglobulin light chain where the variable region has been 
deleted, and which fusion protein preferably is detectably labeled, especially further comprising a 
therapeutic agent, radiolabel or NMR imaging agent linked to said fusion protein. 

19. The fusion protein of claim 15, wherein said CD4 fragment comprises the following amino acid 
sequence: 
20. An immunoglobulin-like molecule comprising the fusion protein of claim 18 and an immunoglobulin 
heavy chain of the class IgM, IgG1 or IgG3, preferably further comprising a detectable label, and especially 
further comprising a therapeutic agent, radiolabel or NMR imaging agent linked to said immunoglobulin-like 
molecule. 

21. A complex comprising the fusion protein of claim 15 or 18 and HIV or SIV gp120. 

22. The complex of claim 21, wherein said gp120 is a part of an HIV or SIV, is expressed on the 
surface of an HIV or SIV-infected cell or is present in solution. 

23. A method for the detection of HIV or SIV gp120 in a sample, characterized by 

(a) contacting a sample suspected of containing HIV or SIV gp120 with the fusion protein of claim 15 
or 18, and 

(b) detecting whether a complex is formed, said fusion protein preferably being detectably labeled. 



Claims for the following Contracting State: GR 

1. A vector comprising a fusion protein gene comprising 1) the DNA sequence of CD4, or fragment 
thereof which binds to HIV gp120, and 2) the DNA sequence of an immunoglobulin heavy chain, 
characterized in that the DNA sequence which encodes the variable region of said immunoglobulin chain 
has been replaced with the DNA sequence which encodes CD4, or said gp120 binding fragment thereof. 

2. The vector of claim 1, having the identifying characteristics of pCD4Hγ1, which has been deposited 
in E. coli at the ATCC under the terms of the Budapest Treaty under Accession No. 67611. 

3. The vector of claim 1, having the identifying characteristics of pCD4Mµ, which has been deposited in 
E. coli at the ATCC under the terms of the Budapest Treaty under Accession No. 67608. 

4. The vector of claim 1, having the identifying characteristics of pCD4Pµ, which has been deposited in 
E. coli at the ATCC under the Budapest Treaty under Accession No. 67609. 
5. The vector of claim 1, having the identifying characteristics of pCD4Eγ1, which has been deposited in 
E. coli at the ATCC under the terms of the Budapest Treaty under Accession No. 67610. 

6. A vector comprising a fusion protein gene characterized by 1) the DNA sequence of CD4, or 
fragment thereof which binds to HIV gp120, and 2) the DNA sequence of an immunoglobulin light chain, 
wherein the DNA sequence which encodes the variable region of said immunoglobulin light chain has been 
replaced with the DNA sequence which encodes CD4, or HIV gp120-binding fragment thereof. 

7. A host transformed with the vector of claim 1. 

8. The host of claim 7 which expresses an immunoglobulin light chain together with the expression 
product of said fusion protein gene to give an immunoglobulin-like molecule which binds to gp120. 

9. A host transformed with the vector of claim 6. 

10. The host of claim 9 which expresses an immunoglobulin heavy chain together with the expression 
product of said fusion protein gene to give an immunoglobulin-like molecule which binds to HIV or SIV 
gp120. 

11. The host of claim 10, characterized in that said immunoglobulin heavy chain is of the 
immunoglobulin class IgM, IgG1 or IgG3. 

12. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120, 
and an immunoglobulin heavy chain, wherein the variable region of the immunoglobulin chain has been 
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a 
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 1, 
said vector further comprising expression signals which are recognized by said host strain and direct 
expression of said fusion protein, and recovering the fusion protein so produced. 

13. The method of claim 12, characterized in that said host strain is a myeloma cell line which produces 
immunoglobulin light chains and said fusion protein comprises an immunoglobulin heavy chain of the class 
IgM, IgG1 or IgG3, wherein an immunoglobulin-like molecule comprising said fusion protein is produced. 

14. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120, 
and an immunoglobulin light chain, wherein the variable region of the immunoglobulin chain has been 
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a 
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 6, 
said vector further comprising expression signals which are recognized by said host strain and direct 
expression of said fusion protein, and recovering the fusion protein so produced. 

15. The method of claim 14, characterized in that said host produces immunoglobulin heavy chains of 
the class IgM, IgG1 and IgG3 together with said fusion protein to give an immunoglobulin-like molecule 
which binds to HIV-gp120. 

16. A method for the detection of HIV or SIV gp120 in a sample, characterized by 

(a) contacting a sample suspected of containing HIV or SIV gp120 with a fusion protein comprising 
1) CD4, or fragment thereof which binds to HIV gp120, and 2) an immunoglobulin heavy chain, wherein the 
variable region of said immunoglobulin chain has been replaced with CD4, or said gp120 binding fragment 
thereof, and 

(b) detecting whether a complex is formed. 

17. The method of claim 16, characterized in that said fusion protein is detectably labeled. 

18. A method for the detection of HIV or SIV gp120 in a sample, characterized by 

(a) contacting a sample suspected of containing HIV or SIV gp120 with a fusion protein 
comprising 1) CD4, or fragment thereof which binds to HIV gp120, and 2) an immunoglobulin light chain, 
wherein the variable region of said immunoglobulin light chain has been replaced with CD4, or HIV 
gp120-binding fragment thereof, and 

(b) detecting whether a complex has formed. 

19. The method of claim 18, characterized in that said fusion protein is detectably labeled. 


Claims for the following Contracting State: ES 

1. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120, 
and an immunoglobulin heavy chain, wherein the variable region of the immunoglobulin chain has been 
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a 
nutrient medium under protein-producing conditions, a host strain transformed with a vector comprising a 
fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to HIV 
gp120, and 2) the DNA sequence of an immunoglobulin heavy chain, wherein the DNA sequence which 
encodes the variable region of said immunoglobulin chain has been replaced with the DNA sequence which 
encodes CD4, or said gp120 binding fragment thereof, said vector further comprising expression signals 
which are recognized by said host strain and direct expression of said fusion protein, and recovering the 
fusion protein so produced. 

2. The method of claim 1, characterized in that said vector has the identifying characteristics of
pCD4Hγ1, which has been deposited in E. coli at the ATCC under the terms of the Budapest Treaty under
Accession No. 67611.

3. The method of claim 1, characterized in that said vector has the identifying characteristics of
pCD4Mμ, which has been deposited in E. coli at the ATCC under the terms of the Budapest Treaty under
Accession No. 67608.

4. The method of claim 1, characterized in that said vector has the identifying characteristics of
pCD4Pμ, which has been deposited in E. coli at the ATCC under the Budapest Treaty under Accession No.
67609.

5. The method of claim 1, characterized in that said vector has the identifying characteristics of
pCD4Eγ1, which has been deposited in E. coli at the ATCC under the terms of the Budapest Treaty under
Accession No. 67610.

6. The method of claim 1, characterized in that said host strain is a myeloma cell line which produces
immunoglobulin light chains and said fusion protein comprises an immunoglobulin heavy chain of the class
IgM, IgG1 or IgG3, wherein an immunoglobulin-like molecule comprising said fusion protein is produced.

7. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin light chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with a vector comprising a
fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to HIV
gp120, and 2) the DNA sequence of an immunoglobulin light chain, wherein the DNA sequence which
encodes the variable region of said immunoglobulin light chain has been replaced with the DNA sequence
which encodes CD4, or HIV gp120-binding fragment thereof, said vector further comprising expression
signals which are recognized by said host strain and direct expression of said fusion protein, and recovering
the fusion protein so produced.

8. The method of any one of claims 1 or 7, characterized in that the DNA sequence which encodes said
fragment of CD4 comprises the following DNA sequence:
    CAATGAACCGGG
    -+---------+ 120
    GTTACTTGGCCC

    GAGTCCCTTTTAGGCACTTGCTTCTGGTGCTGCAACTGGCGCTCCTCCCAGCAGCCACTC
121 ---------+---------+---------+---------+---------+---------+ 180
    CTCAGGGAAAATCCGTGAACGAAGACCACGACGTTGACCGCGAGGAGGGTCGTCGGTGAG

    AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT
181 ---------+---------+---------+---------+---------+---------+ 240
    TCCCTTTCTTTCACCACGACCCGTTTTTTCCCCTATGTCACCTTGACTGGACATGTCGAA

    CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAA
241 ---------+---------+---------+---------+---------+---------+ 300
    GGGTCTTCTTCTCGTATGTTAAGGTGACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT

    ATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAA
301 ---------+---------+---------+---------+---------+---------+ 360
    TAGTCCCGAGGAAGAATTGATTTCCAGGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT

    GAAGCCTTTGGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT
361 ---------+---------+---------+---------+---------+---------+ 420
    CTTCGGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA

    CAGATACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGAGGTGCAATTGCTAGTGTTCG
421 ---------+---------+---------+---------+---------+---------+ 480
    GTCTATGAATGTAGACACTTCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC

    GATTGACTGCCAACTCTGACACCCACCTGCTTC
481 ---------+---------+---------+---
    CTAACTGACGGTTGAGACTGTGGGTGGACGAAG



or a degenerate variant thereof. 

9. The method of any one of claims 1 or 7, characterized in that said DNA sequence which encodes
said fragment of CD4 comprises the following DNA sequence:
    CAATGAACCGGG
    -+---------+ 120
    GTTACTTGGCCC

    GAGTCCCTTTTAGGCACTTGCTTCTGGTGCTGCAACTGGCGCTCCTCCCAGCAGCCACTC
121 ---------+---------+---------+---------+---------+---------+ 180
    CTCAGGGAAAATCCGTGAACGAAGACCACGACGTTGACCGCGAGGAGGGTCGTCGGTGAG

    AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT
181 ---------+---------+---------+---------+---------+---------+ 240
    TCCCTTTCTTTCACCACGACCCGTTTTTTCCCCTATGTCACCTTGACTGGACATGTCGAA

    CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAA
241 ---------+---------+---------+---------+---------+---------+ 300
    GGGTCTTCTTCTCGTATGTTAAGGTGACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT

    ATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAA
301 ---------+---------+---------+---------+---------+---------+ 360
    TAGTCCCGAGGAAGAATTGATTTCCAGGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT

    GAAGCCTTTGGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT
361 ---------+---------+---------+---------+---------+---------+ 420
    CTTCGGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA

    CAGATACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGAGGTGCAATTGCTAGTGTTCG
421 ---------+---------+---------+---------+---------+---------+ 480
    GTCTATGAATGTAGACACTTCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC

    GATTGACTGCCAACTCTGACACCCACCTGCTTCAGGGGCAGAGCCTGACCCTGACCTTGG
481 ---------+---------+---------+---------+---------+---------+ 540
    CTAACTGACGGTTGAGACTGTGGGTGGACGAAGTCCCCGTCTCGGACTGGGACTGGAACC

    AGAGCCCCCCTGGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGGGGTAAAAACATAC
541 ---------+---------+---------+---------+---------+---------+ 600
    TCTCGGGGGGACCATCATCGGGGAGTCACGTTACATCCTCAGGTTCCCCATTTTTGTATG

    AGGGGGGGAAGACCCTCTCCGTGTCTCAG
601 ---------+---------+---------
    TCCCCCCCTTCTGGGAGAGGCACAGAGTC

or a degenerate variant thereof. 

10. The method of claim 7, characterized in that said host produces immunoglobulin heavy chains of
the class IgM, IgG1 and IgG3 together with said fusion protein to give an immunoglobulin-like molecule
which binds to HIV-gp120.
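
The phrase "or a degenerate variant thereof" in claims 8 and 9 refers to a DNA sequence that differs in its codons but encodes the same amino acids. The following is a minimal illustrative sketch in Python, not part of the claims, of how such a variant could be checked under the standard genetic code, and of how the two strands of a listing could be checked for complementarity. The reading frame (starting at the ATG in the first listed fragment), the function names, and the example variant are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Watson-Crick pairing used to compare the upper and lower strands of a listing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(a: str, b: str) -> bool:
    """True if the two sequences differ but encode the same amino acids."""
    return a.upper() != b.upper() and translate(a) == translate(b)


def strands_pair(top: str, bottom: str) -> bool:
    """True if 'bottom' is the base-by-base complement of 'top'."""
    return top.upper().translate(COMPLEMENT) == bottom.upper()


if __name__ == "__main__":
    # First four codons of the listed sequence, assuming the frame starts at ATG.
    listed = "ATGAACCGGGGA"
    # Hypothetical variant using synonymous codons (illustration only).
    variant = "ATGAATAGAGGC"
    print(translate(listed))                       # MNRG
    print(is_degenerate_variant(listed, variant))  # True
    # First fragment of the listing, upper and lower strand.
    print(strands_pair("CAATGAACCGGG", "GTTACTTGGCCC"))  # True
```

A check of this kind only compares encoded amino acids; it says nothing about whether a given fragment binds gp120.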